Abstract

The  goal  of  this  research  was  to  identify  which  learning  curve  model  is  most 
accurate  when  applied  to  Defense  acquisition  programs.  Wright’s  original  learning  curve 
model is widely accepted and used within Defense acquisitions, but the 75-plus-year-old
model may be outdated. This study uses total lot costs for the F-15 C/D & E programs to
compare Wright’s model against three alternative learning curve models: the Stanford-B
model, the DeJong learning formula, and the S-Curve model. The results of the study,
however, are inconclusive. Two of the three alternative models, the DeJong and S-Curve,
rely  on  the  use  of  an  incompressibility  factor  between  0  and  1  that  represents  the 
percentage  of  the  production  process  that  is  automated.  A  Bureau  of  Labor  Statistics 
report identifies that percentage as very low but does not give an exact number.
Therefore, assumptions about that parameter were made. When the factor falls between
0.0  and  0.1  the  DeJong  and  S-Curve  models  appear  to  be  more  accurate;  when  the 
number  is  0.1  or  greater,  Wright’s  model  is  still  the  most  accurate.  Further  research 
should  be  targeted  at  the  exact  value  of  this  factor  to  validate  this,  or  future,  comparative 
studies. 
A COMPARATIVE STUDY OF LEARNING CURVE MODELS IN DEFENSE AIRFRAME COST ESTIMATING


I.  Introduction 


General  Issue 

In  2008,  the  United  States’  economy  took  a  plunge  that  affected  every  industry 
from  the  real-estate  market  to  automobile  manufacturers.  This  crash  led  to  tightened 
budgets  throughout  the  country  and  many  companies  looked  to  operate  more  efficiently 
with  less  capital.  That  economic  turmoil  is  reflected  in  the  Department  of  Defense  (DoD) 
through funding cuts and shrinking budgets at every level. The ten-year sequestration
period approved by Congress with the Budget Control Act of 2011 places emphasis on
commanders and managers using funds efficiently. On a micro level, the scrutiny of
program cost estimates places more pressure on estimators than ever before. Because
sequestration effects and cuts will continue for nearly a decade, cost estimators
and the accuracy of acquisition cost estimates play a more pivotal role than ever before in
acquisition  programs.  Cost  estimates  are  no  longer  just  a  box  to  check  at  milestone 
reviews;  they  now  provide  leverage  for  managers  and  valuable  information  in  balancing 
budgets. One way to assist cost estimators is to provide them with the most current and
appropriate tools for calculating accurate and reliable estimates. Conventional
learning curve methodology, however, has been in practice since the pre-WWII build-up
in the 1930s, and those historical methods may be outdated in today’s fast-paced
technological environment.

Over  the  past  two  decades,  a  new  methodology  rooted  in  the  concept  of  forgetting 
curves  has  emerged,  and  may  provide  a  more  accurate  tool  for  assessing  learning  curves. 
Forgetting is becoming more widely accepted, but its application to learning curves in
manufacturing is scarce. This thesis will examine the question of whether more accurate
learning curve models exist that could be applied to cost estimates within large
acquisition programs. Chapter I of the thesis will provide a background of modern
learning  curve  methodology  followed  by  an  explanation  of  forgetting  and  a  description  of 
the  problem  to  be  investigated.  Chapter  I  will  also  include  a  discussion  of  the 
assumptions  made  in  this  study  and  a  review  of  the  research  methodology  that  will  be 
used  to  test  the  theory  followed  by  a  description  of  the  data  sources  collected.  The 
conclusion  will  provide  a  synopsis  of  the  points  covered  in  this  chapter  as  well  as  a 
blueprint  for  the  subsequent  chapters  of  this  thesis. 

Background 

The concept of learning and the application of learning curves in manufacturing
have been in practical use since the height of the pre-WWII build-up in the late 1930s.
From industrial manufacturing to avionics software, the footprint of the learning
phenomenon has been witnessed throughout both the public and private business sectors.
Early  applications  of  learning  curves  in  aircraft  date  back  to  T.P.  Wright  in  1936  and  his 
report  while  at  Curtiss-Wright  Corporation  (Badiru,  Elshaw,  &  Mack,  2013).  Learning 
curve  methodology  has  undergone  an  evolution  over  the  seventy  plus  years  since  Thomas 
Wright’s  report,  and  it  has  adopted  other  names  along  the  way  such  as  cost  improvement 
curve  or  experience  curve;  however,  the  theory  has  remained  relatively  unchanged 
despite  drastic  changes  in  manufacturing  and  technology.  The  learning  concept  itself  is 
based  on  the  theory  that  as  a  worker  performs  a  task  multiple  times,  he  or  she  will  require 
less and less time to complete the same task due to familiarity with the process. A
learning curve is a mathematical representation of this theory, which states that as the
quantity doubles the worker’s performance will improve at a constant rate; it is
represented in Equation 1 (Wright, 1936). Wright’s model has many different forms, but
the basic architecture remains the same:

$y = ax^{b}$  (1)

In this model, y represents the estimated production hours (or cost) for the xth unit
produced, a is the production hours (or cost) of the theoretical first unit produced,
and b is a factor of the learning rate, which will be explained in greater detail in the
Literature Review.

Wright’s  model  shown  above  has  been  widely  accepted  and  used  in 
manufacturing  for  years;  however,  in  recent  years  a  contradicting  phenomenon  known  as 
forgetting  has  been  recognized.  A  2013  Journal  of  Aviation  and  Aerospace  Perspective 
article  titled  “Half-Life  Learning  Curve  Computations  for  Airframe  Life-cycle  Costing  of 
Composite  Manufacturing”  explains  the  concept  of  forgetting  in  learning  curves. 
Throughout  the  article,  Badiru  et  al.  introduce  forgetting  and  identify  learning  curve 
models  that  account  for  forgetting  by  varying  the  rate  of  learning.  The  authors  state,  “It 
has  been  shown  that  workers  experience  forgetfulness  or  decline  in  performance  even 
while they are making progress along a learning curve” (Badiru et al., 2013). The article
goes on to add that “contemporary learning curves have attempted to incorporate forgetful
components into learning curves” (Badiru et al., 2013). The forgetting concept and the
possible use of these models are the groundwork for this research and lead to the
question of whether conventional learning curve models that ignore this phenomenon
are outdated. This thesis will attempt to demonstrate that modern learning curve models
which  account  for  forgetting  are  more  accurate  in  predicting  actual  manufacturing  hours 
(or  relative  costs)  than  conventional  models.  Subsequent  chapters  of  this  thesis  will 
examine  such  questions  in  an  effort  to  identify  possible  areas  of  improvement  for 
learning  curve  estimation. 

Learning curves are widely used and even expected throughout DoD cost
estimates. This thesis does not intend to discredit the use of learning curves, but rather
to determine whether the commonly used models can be improved upon throughout acquisition
programs.  Air  Force  guidance  on  learning  curve  theory  and  application  primarily 
originates  from  the  Air  Force  Cost  Analysis  Handbook  (AFCAH)  Chapter  8  and  the  DoD 
Basic  Cost  Estimating  Guidebook  (BCE)  Chapter  17.  These  two  resources  primarily 
focus  on  two  learning  curve  theories:  unit  theory  and  cumulative  average  theory.  Unit 
theory  focuses  on  the  cost  of  a  given  unit  and  is  expressed  with  the  same  equation  shown 
in  Equation  1;  “The  unit  theory  states  that  as  the  quantity  of  units  doubles,  the  unit  cost 
decreases  by  a  constant  percentage”  (BCE,  2007). 

Conversely, the cumulative average theory focuses on the average cost of all units
produced up to a certain point in production. Cumulative average theory is often
attributed to Wright himself and his 1936 article “Factors Affecting the Cost of
Airplanes” in which he states, “as the total quantity of units produced doubles, the
cumulative average cost decreases by a constant percentage” (Wright, 1936). This
equation is essentially the same equation as the unit theory equation, but it differs in that
y and x represent cumulative average costs and unit values, respectively. These are the
two  primary  methods  currently  accepted  in  DoD  acquisition  programs. 

As an example, assume an avionics manufacturer wants to produce eight units of a
given aircraft component. The company believes the first unit will cost $100,000 and the
plant will experience an 80% learning curve. Table 1 below provides
estimates under both the unit and cumulative average (Cumm Avg) theories. The table
shows  that  the  estimate  for  a  given  unit  will  always  be  higher  with  the  cumulative 
average  theory  because  it  takes  into  account  all  of  the  previous  units  produced  at  a  higher 
cost.  In  DoD  cost  estimating,  cumulative  average  theory  is  considered  conservative,  but 
it  can  also  provide  more  consistent  analysis  of  the  data  due  to  the  fact  that  actual  costs  are 
often  reported  in  annual  lot  totals  rather  than  individual  unit  costs. 


Table 1: 80% Learning Curve Estimates (in $K)

Unit    Unit Theory    Cumm Avg Theory
1       $100.00        $100.00
2       $80.00         $90.00
3       $70.21         $83.40
4       $64.00         $78.55
5       $59.56         $74.75
6       $56.17         $71.66
7       $53.45         $69.06
8       $51.20         $66.82
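
The figures in Table 1 can be reproduced with a short calculation. The sketch below is illustrative only. It assumes the unit theory column follows Equation 1 with b = log 0.80 / log 2, and that the Cumm Avg column is the running average of those unit theory costs, which is the reading that matches the tabulated values. The function names are hypothetical and do not come from the source.

```python
import math

def unit_cost(first_unit_cost, learning_rate, unit):
    """Cost of a given unit under Equation 1: y = a * x**b, b = log(R)/log(2)."""
    b = math.log(learning_rate) / math.log(2)
    return first_unit_cost * unit ** b

def table_rows(first_unit_cost, learning_rate, n_units):
    """Return (unit, unit-theory cost, running average of those costs) rows."""
    rows, running_total = [], 0.0
    for unit in range(1, n_units + 1):
        cost = unit_cost(first_unit_cost, learning_rate, unit)
        running_total += cost
        rows.append((unit, round(cost, 2), round(running_total / unit, 2)))
    return rows

# Example inputs from the text: $100K first unit, 80% curve, eight units.
for unit, unit_theory, cum_avg in table_rows(100.0, 0.80, 8):
    print(f"{unit:>4}  {unit_theory:>8.2f}  {cum_avg:>8.2f}")
```

Rounding is applied only when a row is reported, so the running total is accumulated at full precision.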

Problem  Statement/Research  Objectives 

Both  unit  and  cumulative  average  theories  are  used  by  cost  estimators  to  better 
forecast  total  system  costs,  but  in  this  fiscally  constrained  economic  period,  it  may  be 
time for the DoD to examine more modern methods in its forecasting techniques. This
thesis will attempt to answer the question of whether DoD cost estimates can be
significantly improved upon with the application of alternative learning curve models.
Current DoD models assume a constant rate of learning, while many of the alternative
models incorporate some aspects of forgetting and thus a declining learning rate. With
that research focus, the following investigative questions are presented:

1- Can any of the modern learning curve models be applied to current DoD
aircraft  cost  estimating  procedures?  If  so,  which  ones? 

2-  Are  learning  curve  models  that  account  for  forgetting  more  accurate  than  the 
conventional  learning  curve  model  commonly  used  today?  If  so,  which  ones? 

3-  Which  learning  curve  model  is  most  accurate  at  predicting  the  actual  cost  of 
an  acquisition  system? 

Subsequent  chapters  of  this  thesis  will  attempt  to  answer  these  questions  as  well  as 
outline  the  research  findings  that  apply  to  each.  These  results  could  prove  to  be 
paramount  in  an  ongoing  attempt  to  increase  estimate  accuracy  and  improve  the 
efficiency  of  DoD  acquisition  spending. 

Methodology 

Once  the  data  are  collected  and  standardized  for  this  research,  the  analysis  should 
be  straightforward  for  readers  to  follow.  Each  of  the  three  models  identified  in  the 
screening  process  for  this  study  will  be  used  to  predict  total  airframe  lot  costs  for  the  F-15 
C/D  &  E.  The  three  models  and  their  formulations  will  be  explained  in-depth  in  Chapters 
II  and  III.  Each  of  the  predicted  airframe  lot  costs  for  the  three  alternative  models  will 
then  be  compared  to  Wright’s  model  and  the  actual  lot  costs  to  calculate  the  error,  also 
known  as  the  residual.  The  percent  error  for  each  of  the  models  will  be  compared  to  the 
other  models  using  an  Analysis  of  Variance  (ANOVA)  and  Dunnett  means  test,  which 
will each be explained in Chapter III. A significance level, or alpha (α), of .05 will be
used to determine whether at least one of the models has a mean residual value different
from the rest.
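
The comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in Chapter III: the lot costs and model names in the example are made-up placeholders (not F-15 data), the helper names are hypothetical, and only the one-way ANOVA F statistic is computed. The p-value at α = .05 and the Dunnett comparison against Wright's model would come from a statistics package.

```python
from statistics import mean

def percent_errors(predicted, actual):
    """Percent error of each lot prediction relative to the actual lot cost."""
    return [100.0 * (p - a) / a for p, a in zip(predicted, actual)]

def one_way_anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical lot costs and predictions, for illustration only.
actual = [100.0, 90.0, 85.0, 80.0]
predictions = {
    "Wright": [104.0, 93.0, 84.0, 78.0],
    "Alternative": [101.0, 91.0, 86.0, 80.5],
}
errors = {name: percent_errors(p, actual) for name, p in predictions.items()}
f_stat = one_way_anova_f(list(errors.values()))
print({name: round(mean(e), 2) for name, e in errors.items()}, round(f_stat, 3))
```

A large F statistic relative to the critical value for (k - 1, n - k) degrees of freedom would indicate that at least one model's mean percent error differs from the rest.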

Implications 

If  significant  results  are  discovered  as  stated  above,  the  final  piece  of  analysis  will 
be  to  determine  which  model  is  the  best  predictor  of  actual  production  costs.  One  simple 
way  to  compare  the  models  will  be  to  compare  which  model  has  the  least  amount  of 
standard  error  expressed  as  a  percentage.  The  smallest  percent  error  will  reflect  the  most 
accurate  model.  As  a  result,  if  it  is  supported  that  one  of  the  modern  learning  curve 
models  is  a  more  accurate  predictor  than  the  conventional  method  used  today,  then  those 
results  could  be  presented  for  further  analysis  and  potentially  enacted  into  future  Air 
Force  and  DoD  guidance,  or  at  a  minimum  provide  a  proxy  for  further  research. 

Assumptions/Scope 

One  of  the  greatest  challenges  of  this  research  will  be  the  application  of  variables 
used  for  the  more  modern  learning  curve  formulas.  Several  of  these  formulas  use 
constants  or  other  learning  factors  that  allow  the  models  to  compensate  for  the  loss  of 
learning.  Variables  such  as  previous  experience  units  and  incompressibility  factor,  which 
will  be  explained  in  Chapter  II,  must  be  correctly  predicted  in  order  for  the  models  to  be 
accurate. However, many of those factors will be estimated based on certain criteria that
are extracted from the data set or calculated given other values in the formula. Constants
and factors used in the models will be included based on the data provided and on
reasonable assumptions rooted in expert opinion. A further description of these factors
and the assumptions made to apply the formulas can be found in Chapter III of this
report.

This research has a fairly narrow scope and focuses solely on fighter aircraft
costs within the Air Force, specifically the F-15. Analysis will focus on the airframe
costs of the Air Force F-15 A-E spread over a 17-year period. This scope was narrowed
by the availability and applicability of data, which will be detailed in Chapter III.
Application to additional platform types such as cargo aircraft or bombers and even
different system types such as ships, ground vehicles, or satellites is an area for potential
follow-on research.

Conclusion 

The primary goal of this thesis is to address the research question of whether
modern learning curve models that account for performance decay predict
actual production costs more accurately than the conventional models often used today.
The data analysis involved will statistically compare the accuracy of three selected
learning curve models against the conventional model used throughout DoD. Significant
results and the identification of the most accurate model will provide a stepping-stone to
possible methodological changes within the Air Force and DoD and provide increased
accuracy of acquisition cost estimates.

The  next  chapter  will  provide  a  more  in-depth  look  into  the  literature  surrounding 
the  concepts  of  learning  and  unlearning  in  manufacturing  both  inside  and  outside  the 
government.  Chapter  II  will  also  examine  current  DoD  and  Air  Force  guidance  on 
learning  curve  methodology  and  application  of  learning  curves  in  cost  estimates,  as  well 
as provide in-depth descriptions of the three models presented. Chapter III will step
through  the  methodology  used  to  test  the  investigative  questions  as  well  as  provide 
details  into  the  data  sets  collected  for  the  study.  Chapter  III  will  also  provide  analysis  of 
the  data  set  needed  for  the  application  of  the  alternative  learning  models.  Chapter  IV  will 
contain  the  data  results  compiled  from  the  methods  described  in  Chapter  III  including 
relevant  charts  and  graphs  from  the  analysis.  The  thesis  will  conclude  with  Chapter  V, 
which  will  contain  a  discussion  of  the  significance  of  the  results  as  well  as  the  potential 
impact  of  the  findings  on  learning  curve  methodology  both  inside  and  outside  of  DoD. 
Chapter  V  will  also  include  areas  that  require  additional  research,  limitations  to  this 
study,  and  possible  follow-on  thesis  topics. 
II. Literature Review


Introduction 

Very  few  things  in  business  are  constant;  performance  is  no  exception  to  that 
uncertainty.  Performance  varies  externally  from  worker  to  worker,  division  to  division, 
and  internally  from  day  to  day,  season  to  season,  or  year  to  year.  Take  for  instance  the 
production  of  an  automobile.  While  the  process  and  parts  are  always  the  same,  a  savvy 
car  buyer  may  want  to  avoid  cars  that  were  built  on  a  Monday  or  Friday.  The  worker  and 
even  the  entire  assembly  line  may  suffer  a  loss  in  performance  due  to  working  at  the 
beginning  or  end  of  the  week.  This  concept  of  uneven  and  even  degrading  performance 
over  time  is  the  root  of  forgetting  theory  and  the  foundation  for  this  research. 

The  Budget  Control  Act  of  2011,  which  calls  for  a  $1.5  trillion  deficit  reduction 
over  the  next  10  years,  has  created  a  fiscally  constrained  environment  in  which 
competition  for  congressional  funding  is  higher  than  ever  before.  On  an  organizational 
level,  DoD  acquisition  programs  have  seen  budget  cuts  up  to  ten  percent,  changes  in 
acquisition  schedule,  reduction  in  the  number  of  systems  purchased,  and  an  increased 
scrutiny  over  cost  estimates.  Adopting  models  and  theories  that  potentially  increase  cost 
estimating  accuracy  can  prove  beneficial  to  organizations  and  provide  leverage  for 
leaders  defending  their  budget  position. 

Learning curve theory has been debated and modified for decades; however, the
theory and its application to Department of Defense (DoD) cost estimating have remained
relatively unchanged and have not readily adapted to current industrial theories or trends.
While many agree on the psychological effects associated with learning
and process improvement, the application of learning toward manufacturing and
production is debated. In recent years, several learning curve models have attempted to
capture the recently identified phenomenon of forgetting, in which a worker’s
performance begins to decrease over time.

This chapter will deliver an in-depth review of present-day learning theories and
modern forgetting curve methodology, including the models that attempt to relate the two
together.  The  theory  and  methodology  will  be  followed  by  a  description  of  the  issue  and 
provide  a  look  into  current  DoD  learning  methodology  and  application.  This  chapter  will 
examine  any  prior  research  in  the  area,  look  at  similar  approaches  found  in  the  literature, 
as  well  as  provide  a  description  of  other  appropriate  methodologies  and  applications 
adopted  over  the  past  two  decades,  and  conclude  with  obstacles  and  limitations  to  the 
literature  and  research. 

Theory  Review 

Learning  Curves 

Learning curves came into use among practitioners in the manufacturing world in
the late 1930s. At the height of the pre-World War II build-up, aircraft
production costs were recognized to be as important as developing and producing the
aircraft themselves. T. P. Wright (1936) first identified the existence of the learning
relationship. He correctly theorized that as a worker performs the same task multiple
times, the time required to complete that task will decrease at a constant rate. The
workers are learning from previous experience and thus becoming more efficient in
completing the task. Wright also identified the 80 percent learning effect in aircraft
production. He believed that organizations would observe a learning rate of 80%, or a
20% production improvement, as the number of units produced doubled (Wright, 1936).
This rule would serve as a suggested standard, but has been changed and modified over
time to fit different industries. A graphical representation of Wright’s 80% learning
curve where the first unit costs $100,000 can be seen below in Figure 1. As the graph
shows, when the number of units produced doubles (from 1 to 2, 2 to 4, 4 to 8 and
so on) the average cost to produce the unit is reduced by approximately 20%.

Figure 1: Wright’s 80% Learning Curve Example


This classical learning curve model, often referred to as Wright’s Learning
Model, gives a mathematical representation of Wright’s basic learning theory. The model,
shown in Equation 1 below, follows the assumption that as the quantity produced
doubles, the cost will decrease at a constant rate.


$y = ax^{b}$  (1)

Where:

y = the cumulative average time (or related cost) after producing x units
a = hours required to produce the (theoretical) first unit
x = cumulative unit number
b = log R / log 2 = learning index
R = learning rate (a decimal)
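
As a brief worked check (an illustration added here, using only Equation 1 and the definition of b above), the doubling property follows directly from the model: the ratio of the value at 2x units to the value at x units is the learning rate R, whatever x is.

```latex
\frac{y(2x)}{y(x)} = \frac{a\,(2x)^{b}}{a\,x^{b}} = 2^{b} = 2^{\log R / \log 2} = R
% Example with R = 0.80: b = \log 0.80 / \log 2 \approx -0.3219,
% so y(2) = 0.80\,a, \quad y(4) = 0.64\,a, \quad y(8) = 0.512\,a.
```

With a = $100,000 these values match the 80% example used earlier in the text ($80,000, $64,000, and $51,200 at units 2, 4, and 8).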

For the remaining sections of this chapter, Wright’s model will be referred to in its more
modern form of $T_x = T_1 x^{-b}$. This model can also be expressed linearly by transforming
the equation through simple algebra. This transformation to a linear relationship
becomes useful in regression analysis, in which practitioners attempt to fit a straight line
to the transformed data. The log-linear form of Wright’s equation, seen in Equation 2,
can be derived by taking the natural logarithm of both sides of Equation 1:

$\ln y = \ln a + b \ln x$  (2)

Using  the  log-linear  form  of  the  equation,  the  constant  learning  curve  rate  can  be  seen  in 
linear  form: 


Figure 2: Log-Linear Learning Curve Example

The graph shows that Wright’s Learning Curve assumes a constant learning rate over
time  illustrated  by  the  straight  line.  At  any  point  in  production,  the  learning  rate,  and  thus 
performance,  are  constant. 
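
The regression use of the log-linear form can be illustrated with a small sketch. It assumes ordinary least squares applied to ln y against ln x, as in Equation 2, and uses points taken from the 80% example in the text rather than real program data. The function name `fit_log_linear` is hypothetical.

```python
import math

def fit_log_linear(units, costs):
    """OLS fit of ln(y) = ln(a) + b*ln(x); returns (a, b, learning_rate)."""
    xs = [math.log(x) for x in units]
    ys = [math.log(y) for y in costs]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx                      # slope of the straight line in log space
    a = math.exp(y_bar - b * x_bar)    # intercept converted back to a first-unit value
    return a, b, 2 ** b                # 2**b recovers the learning rate R

# Points on the 80% curve with a $100K first unit (example values from the text).
units = [1, 2, 4, 8]
costs = [100.0, 80.0, 64.0, 51.2]
a, b, rate = fit_log_linear(units, costs)
print(round(a, 2), round(b, 4), round(rate, 2))
```

Because these points lie exactly on the curve, the fit recovers the first-unit value and the 80% rate; with real lot data the fitted line would only approximate the observations.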

J. R. Crawford (1944) adopted a similar learning curve approach in the individual
unit model that he introduced in a training manual at Lockheed Aircraft Corporation. Crawford’s
model  uses  the  same  basic  formula  as  Wright’s  model,  but  attempts  to  estimate  individual 
times  (or  related  cost)  to  produce  a  given  unit  by  changing  which  variables  are  input  into 
the  model.  An  example  of  this  model  can  be  seen  in  Figure  3  below.  This  model  proved 
to  be  beneficial  because  it  can  be  applied  to  individual  workers  or  projects  rather  than  to 
the  organization  as  a  whole  (Jaber,  2011). 

Figure 3: Unit Theory Learning Curve Example


Both  unit  theory  and  cumulative  average  approaches  are  used  in  acquisition  cost 
estimating  depending  on  the  amount  and  validity  of  historical  program  data.  However, 
contractor reports often come in the form of lots. This form of data is usually better
suited to a cumulative average learning curve. The AFCAH illustrates how
such data can be used as a lot average in the cumulative average learning curve theory
rather than finding a theoretical lot midpoint as with the unit theory.

[A]pply  the  Cum  Avg  formulation  to  contractor  lot  information,  add  the 
hours/costs  for  a  given  lot  to  the  hours/costs  of  all  previous  lots.  The  hour/cost 
plot  value  (Y  axis)  of  a  given  lot  is  the  total  hours/costs  through  that  lot  divided 
by  the  last  unit  number  of  that  lot,  while  the  unit  plot  point  (X  axis)  is  the  last  unit 
number  of  that  lot.  Lot  midpoints  are  not  used  with  the  Cum  Avg  formulation 
(AFCAH,  2007). 

Furthermore,  Hu  and  Smith  (2013)  identify  a  method  for  plotting  and  predicting 
learning  curves  using  lot  data.  “If  the  cumulative  average  costs  for  all  consecutive  lots 
are  present,  then  the  direct  approach  can  be  applied  to  the  lot  data  with  the  last  unit  in  the 
lot as the lot plot point (LPP).” This LPP is the same as the unit plot point described in the
AFCAH and provides a means for plotting lot data against individual units (on the X axis)
in order to determine the learning parameters. Hu and Smith describe this process, saying,
“T1, b, and other exponents can be obtained directly from the ordinary least squares
(OLS)  method  by  regressing  [cumulative  average  costs]  vs.  cumulative  quantities”  (Hu  & 
Smith,  2013).  The  application  of  this  process  to  the  F-15  data  will  be  described  in  greater 
detail  in  Chapter  III. 
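
The lot plot point procedure quoted above can be sketched as follows. This is an illustration only, not the Chapter III implementation: the lot quantities and costs are hypothetical values chosen to lie on an 80% cumulative average curve, and the function names are assumptions. Each lot is plotted at its last unit number (X) against the cumulative cost through that lot divided by that unit number (Y), and the parameters are then obtained by OLS in log space.

```python
import math

def lot_plot_points(lot_quantities, lot_costs):
    """Return (last unit number, cumulative average cost) for each lot."""
    points, cum_units, cum_cost = [], 0, 0.0
    for quantity, cost in zip(lot_quantities, lot_costs):
        cum_units += quantity
        cum_cost += cost
        points.append((cum_units, cum_cost / cum_units))
    return points

def fit_cum_avg_curve(points):
    """Regress ln(cum avg cost) on ln(cum quantity); returns (T1, slope b)."""
    xs = [math.log(units) for units, _ in points]
    ys = [math.log(avg) for _, avg in points]
    n = len(points)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return math.exp(y_bar - slope * x_bar), slope

# Hypothetical lots (not F-15 data): 2, 2, and 4 units with total lot costs in $K.
quantities = [2, 2, 4]
costs = [160.0, 96.0, 153.6]
points = lot_plot_points(quantities, costs)
t1, b = fit_cum_avg_curve(points)
print(points, round(t1, 2), round(2 ** b, 3))
```

No lot midpoints are computed, in keeping with the AFCAH description of the cumulative average formulation.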

Since  Wright’s  initial  theory,  several  other  models  have  been  adopted  in  learning 
curve literature. One of the earliest modifications to the learning curve model came
with the introduction of the Stanford-B model shown below in Equation 3.

$T_x = T_1 (x + B)^{-b}$  (3)

Where:

T_x = the cumulative average time (or related cost) after producing x units
T_1 = hours required to produce the (theoretical) first unit
x = cumulative unit number
b = log R / log 2 = learning index
B = equivalent experience units (a constant); slope of the asymptote of the curve

(Yelle 1979)

This model is first attributed to Louis E. Yelle (1979) during a government-funded
research initiative at Stanford. It introduces the equivalent experience unit parameter to
Wright’s original equation. This parameter, represented by B, is a constant from zero to
ten accounting for the number of units produced prior to the start of production of the first
unit and is the slope of the asymptote of the learning curve. If this factor is zero, the
model reverts to Wright’s original learning model shown earlier in Figure 1 (Badiru
2012). Conversely, if the factor is ten, the effects of learning will begin at the eleventh
unit and the decrease in performance will occur much sooner, causing the learning curve
slope to flatten quickly. The effect of a high B constant on the same data set used earlier
can be seen below in Figure 4, which assumes that 10 units have been produced on a
previous  contract.  The  prior  experience  parameter  allows  the  formula  to  account  for  prior 
learning  and  essentially  continue  learning  from  some  previous  point  in  time  rather  than 
starting  the  learning  process  over  from  zero.  Chapter  III  will  address  the  use  of  the 
equivalent  experience  unit  parameter  in  this  study  and  how  those  values  were  determined 
for  each  of  the  models. 
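
A short sketch can make the role of B concrete. It is illustrative only and rests on one labeled assumption: b is taken as the positive magnitude of log R / log 2, so that the explicit negative exponent in Equation 3 produces a declining curve. The function name and example inputs are hypothetical.

```python
import math

def stanford_b(t1, learning_rate, x, prior_units=0.0):
    """Stanford-B form T_x = T1 * (x + B)**(-b)."""
    # Assumption: b is the positive magnitude of log(R)/log(2), so the curve declines.
    b = -math.log(learning_rate) / math.log(2)
    return t1 * (x + prior_units) ** (-b)

# With B = 0 the model reduces to Wright's curve; B = 10 assumes ten prior units.
for prior in (0.0, 10.0):
    values = [round(stanford_b(100.0, 0.80, x, prior), 2) for x in (1, 2, 4, 8)]
    print(f"B = {prior:>4}: {values}")
```

In this sketch the ratio between successive doublings moves closer to 1 as B grows, which corresponds to the flatter curve described for a high B constant.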

Figure 4: Stanford-B Model Example with B=10

When  the  Stanford-B  model  is  graphed  in  log-linear  form  as  shown  in  Figure  5,  one  can 
see  a  slow  build  up  in  performance  that  is  attributed  to  the  production  of  prior  experience 
units. 


Figure 5: Stanford-B Model Example in Log-Linear Form

Another variation of learning curve models is DeJong’s Learning Formula.
DeJong’s model, seen below in Equation 4, is another derivation from Wright’s original
function in which the incompressibility factor is introduced. Represented by the constant
M, this factor represents the relationship between manual processes and machine-dominated
processes. The incompressibility factor is a constant between zero and one in
which a value of zero implies a fully manual operation and a value of one denotes a
completely machine-dominated operation (Badiru et al., 2013).

$T_x = T_1 [M + (1 - M) x^{-b}]$  (4)

Where:

T_x = the cumulative average time (or related cost) after producing x units
T_1 = hours required to produce the (theoretical) first unit
x = cumulative unit number
b = log R / log 2 = learning index
M = incompressibility factor (a constant)

Wright’s original model, which inherently assumes an incompressibility factor of
zero, fails to account for the advances in manufacturing technology that drive a major
percentage of the production industry. A graph with an incompressibility factor of 0.70 is
shown in Figure 6 to illustrate the difference in the models.

Figure 6: DeJong Learning Curve Example with M = 0.70

As the graph demonstrates, a high incompressibility factor reduces the effects of learning
and  causes  a  much  quicker  flattening  of  the  curve.  Figure  7  below  shows  the  log-linear 
graph  from  the  model,  in  which  the  loss  of  learning  and  decrease  in  performance  can  be 
seen  over  time. 

Production of something as complex as a military aircraft, and a fighter aircraft in
particular, will likely fall much closer to zero than one on that scale because of the
specialization needed in the production process, similar to that of a high-end sports car.
However, there is no literature on the exact value of that figure for aircraft production, and
it may vary from company to company. Therefore, this research will assume a highly
manual process and look at a range of incompressibility factors (from 0.0 to 0.2) to see if
changes in M have an effect on the results. An explanation of how the factors for this study
were determined can be found in the methodology section of Chapter III.

Figure 7: DeJong Model Example in Log-Linear Form

One of the potential weaknesses of the two previous models is that the Stanford-B model
does not account for incompressibility, and DeJong’s model does not account for
previous units produced.

The S-Curve model, however, accounts for both of these factors together. Carr
(1946)  believed  that  there  was  an  error  in  Wright’s  constant  learning  assumption  and 
hypothesized  that  the  effects  of  learning  and  thus  performance  followed  the  S-Curve 
shape  seen  below  in  Figure  8. 


Figure 8: Carr’s (1946) S-shaped Learning Curve (horizontal axis: cumulative unit number)

The S-Curve model assumes a gradual build-up in the early stages of production followed
by a period of peak performance. This build-up is typically attributed to personnel and
procedural changes as well as the time needed for new machinery set-ups that occur early in
the production process. Using the theory hypothesized by Carr, Towill and Cherrington
(1994) developed a model that followed the S-shaped pattern. The S-Curve model,
shown below in Equation 5, assumes that learning takes the S-shaped curve often seen in
a cumulative normal distribution.

At the top of the curve, from points A to B, there is a slow build-up period before
the worker or organization can be fully proficient in accomplishing the task. Then, from
points B to C, there is a gradual improvement in production time due to repetition of the
process. The trailing-off effect, from points C to D, is referred to as the slope of
diminishing returns and is similar to the trends seen on the tail of the log-linear form of
the DeJong Model; after a worker or organization has reached maximum efficiency, it
will experience forgetting and other inefficiencies in production.


$T_x = T_1 + M (x + B)^{-b}$    (5)

Where:

$T_x$ = the cumulative average time (or related cost) after producing $x$ units
$T_1$ = hours required to produce the (theoretical) first unit
$x$ = cumulative unit number
$b$ = $\log R / \log 2$ = learning index
$M$ = incompressibility factor (a constant)
$B$ = equivalent experience units (a constant)

Badiru et al. (2013) describe the slope of diminishing returns with the following scenario:

[C]onsider when a worker begins learning a new task. The individual is slow
initially at the tail end of the S-Curve, but the rate of learning increases as time
goes on, with additional repetitions. This helps the worker to climb the steep-slope
segment of the S-Curve very rapidly. At the top of the slope, the worker is
classified as being proficient with the learned task. From then on, even if the
worker puts much effort into improving upon the task, the resultant learning will
not be proportional to the effort expended. (Badiru et al., 2013)

This concept captures the impact of forgetting. Even as the worker is progressing along
the learning curve, forgetting will eventually take place. Use of this model in research
may provide a more accurate look at the actual learning and forgetting that occurs over a
production life-cycle.

Several other learning models have been identified in other literature. Levy’s
adaptation function, which uses a k constant to level off the learning curve; Knecht’s
upturn model, which uses a c constant to reverse the direction of the learning curve
at higher cumulative volumes; Glover’s learning formula, which applies individual
learning results at an organizational level; Pagel’s Exponential Function, which uses
parameters based on empirical analysis; and the Cobb-Douglas model, which applies
independent variables to the learning function, have all been used and applied in other
areas of research (Kar 2007). The three models that will be used in this research are the
Stanford-B Model, DeJong’s Learning Formula, and the S-Curve Model. A graphical
comparison of these models is shown below in Figure 9. Several of the other models
require additional information and data that are not available. Also, the three models
listed have similar parameters that can be easily identified or assumed, making them more
useful to cost estimators who put them to practical use. The goal is to make the
estimator’s job easier, not complicate it with a series of equations that cannot easily be
explained to decision makers. The following section will investigate some of the
literature regarding forgetting theory and some of the modern forgetting models and how
they are used.

Figure 9: Learning Curve Models (Badiru 1992)

Forgetting  and  Forgetting  Curve  Models 

Learning and unlearning often take place simultaneously in manufacturing and
production environments. Learning has been recognized and modeled in these
environments, but the unlearning, or forgetting, aspect is often neglected. Forgetting
simply refers to the concept that workers will inevitably see a decline in performance
(from many potential sources) while still theoretically moving along the learning curve
(Badiru 1995). Badiru (2012) also expresses this concept visually in a chart that displays
a worker’s performance over time, shown below in Figure 10. Unlike the constant
rate of learning first proposed in Wright’s original model, this figure illustrates that a
worker or organization will experience intermittent periods of forgetting that cause
performance to be lower than anticipated.

Figure 10: Effects of Forgetting on Performance

This decline in performance leads to longer production times and thus higher costs than
estimated. The assumption of a constant learning rate may be one of many reasons that
DoD cost estimates have been inaccurate in the past. Understanding the forgetting
phenomenon and successfully applying it to Air Force and DoD acquisition programs can
be an integral step in improved estimate accuracy.

In recent decades, several learning curve models have been applied to a number of
manufacturing and production settings. Increasingly, contemporary models have
attempted to incorporate the forgetting concept to measure the impact of forgetting on
overall performance. Jaber and Sikstrom (2004) identify the potential for forgetting
curve research:

Learning and forgetting processes have received increasing attention by
researchers and practitioners in the field of production and operations
management for the last two decades. A handful of theoretical, experimental and
empirical mathematical forgetting models have been developed, with no
unanimous agreement among researchers and practitioners on the form of the
forgetting curve.

One potential cause for forgetting is production breaks. Nembhard and Osothsilp
(2001) performed a comparative study of 14 different forgetting curve models designed
to account for production breaks. The study tested the models against the three
predetermined criteria of efficiency, stability and parsimony. The study showed that the
Recency Model produced the best results and had the ability to capture multiple
production breaks along the same learning curve (Nembhard and Osothsilp 2001).
However, the limitations of this model were scrutinized by Sikstrom and Jaber, who
argued that the findings were not consistent with fundamental memory literature, and
there is still no consensus today on the best forgetting model.

Many  forgetting  models  have  useful  aspects  from  an  internal  perspective  in  the 
private  sector,  but  their  use  may  be  limited  for  the  government.  These  models  are  used  to 
predict  starting  costs  after  production  breaks  or  evaluate  individual  performance.  One 
argument  against  the  use  of  forgetting  curves  in  military  production  is  that  while  military 
budgets  are  turbulent,  military  production  is  fairly  constant  and  spans  over  several  years. 
While  production  numbers  may  change  and  production  schedules  may  slip  and  cause 
programs  to  extend  the  life  of  their  contract,  production  breaks  are  very  rare.  Benkard 
(2000)  explains,  “Because  of  the  regularity  in  military  programs,  organizational 
forgetting  and  spillovers  of  production  experience  are  less  apparent.”  This  makes  the 
application  of  forgetting  models  difficult  and  at  times  inappropriate  within  the  DoD. 
However, this research applies the concept of forgetting over time even while progressing
along a learning curve rather than forgetting due to production breaks. The theory at
work  in  this  research  is  that  learning  rates  are  not  constant  (due  to  forgetting)  and  models 
that  do  not  assume  a  constant  learning  rate  may  be  more  applicable  to  DoD  estimating. 

There  is  some  DoD  literature  regarding  learning  lost  due  to  production  breaks 
despite  how  rarely  they  occur.  DoD  guidance  references  the  Anderlohr  method  as  a  way 
to  determine  the  amount  of  learning  lost  during  a  production  break.  George  Anderlohr 
(1969)  identifies  five  factors  that  influence  the  amount  of  learning  lost:  personnel 
learning,  supervisory  learning,  continuity  of  production,  methods,  and  special  tooling. 
Personnel  learning  refers  to  the  physical  loss  of  personnel  due  to  regular  movement  or 
lay-offs,  and  supervisory  learning  refers  to  supervisory  personnel  lost  due  to  regular 
movement.  Continuity  refers  to  the  production  line  itself,  and  how  closely  integrated  the 
workers  and  stations  are.  The  methods  of  production  are  typically  recorded  and 
documented,  so  there  is  very  little  if  any  learning  lost  in  this  area.  Special  tooling  refers 
to  wear  and  physical  damage  of  tooling  and  the  possible  need  of  newer  and  better 
equipment. 

These five factors are each assigned a weight, with the weights summing to 100%, and
those weights are multiplied by the percentages of learning lost in each category. The
sum of these weighted percentages reflects the total learning lost within the organization.
Once this percentage is calculated, it is added to the production cost of the last unit
produced to estimate the cost of the first unit after the production break. The programs
used in this analysis do not have any production breaks, and therefore calculating
learning lost using the above method is not required. However, the method is significant
because it begins the progression within the DoD towards accepting a learning rate that is
not constant and accepting the principle behind forgetting. To date, however, that
methodology has not been applied to the learning curve models used. This research will
look to build upon that progress and assess whether modern models can be applied to DoD
cost estimates. The next section will address this issue and the purpose of this research.
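As an illustration of the Anderlohr weighting described above, the following Python sketch computes the total learning lost as a weighted sum over the five factors. All weights and loss fractions are hypothetical values chosen for the example, not figures from DoD guidance or from the programs in this study, and the sketch stops at the loss percentage rather than estimating the restart cost.

```python
# Hypothetical (weight, fraction of learning lost) pairs for Anderlohr's
# five factors. The weights are assumed values and must sum to 1.0 (100%).
factors = {
    "personnel learning": (0.30, 0.40),
    "supervisory learning": (0.20, 0.25),
    "continuity of production": (0.20, 0.50),
    "methods": (0.10, 0.05),
    "special tooling": (0.20, 0.10),
}


def total_learning_lost(factors: dict) -> float:
    """Weighted sum of the learning lost in each category."""
    total_weight = sum(weight for weight, _ in factors.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("factor weights must sum to 100%")
    return sum(weight * lost for weight, lost in factors.values())


lost = total_learning_lost(factors)
print(f"Total learning lost: {lost:.1%}")
```

In this made-up case, the two personnel-related factors account for more than half of the total loss because they carry half of the weight and relatively high loss fractions.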

Problem  Statement 

Learning  curve  literature  and  theory  have  evolved  over  the  decades  and  the 
negative  effects  of  forgetting  are  widely  accepted  by  researchers  and  practitioners  alike. 
Technology  in  both  aircraft  design  and  manufacturing  has  also  continued  to  improve  over 
the  years  since  Wright  first  identified  the  relationship  between  learning  and  production 
costs. However, some learning curve methodology has failed to keep pace with this
improvement. DoD guidance in both the AFCAH and BCE refers to Wright’s model as the
appropriate learning curve application for cost estimators. While the validity of
Wright’s original theory has long been accepted, the need to integrate the impact of
forgetting into learning curves to improve accuracy cannot be ignored.

Badiru et al. address the issue, saying, “In defense-contractor manufacturing of
airframes, where a mix of contract employees, government civilians, and military
coordinators can exist, the issues of overall learning, unlearning, or half-learning can
become very significant” (Badiru et al., 2013). In a time of such financial turmoil and
uncertainty amid government furloughs and sequestration, exercising every tool and
method available to improve estimating accuracy should be paramount. Badiru et al. also
address the need for forgetting curves within defense cost estimating by adding, “With
life-cycle costing that stretches over generations of airframes, breaks in production are
not the exception, but rather, the rule. Coping with these production gaps and properly
estimating the associated costs is of primary concern.” This paper will address that very
issue of forgetting curves in DoD aircraft production. Later chapters investigate whether
defense cost estimators should incorporate more modern learning curve models into their
estimates and which model is the best predictor.

The Air Force initiated the Better Buying Power (BBP) Initiative in 2010. This
initiative, currently in its third iteration, sets forth a group of core acquisition
principles aimed at increasing affordability and making the DoD acquisition process
more efficient. BBP encourages innovation and elimination of wasteful practices. BBP
consists of seven core focus areas: achieve affordable programs, control costs throughout
the product lifecycle, incentivize productivity and innovation in industry and government,
eliminate unproductive processes and bureaucracy, promote effective competition,
improve tradecraft in acquisition of services, and improve professionalism of the total
acquisition workforce.

One  possible  application  from  the  findings  of  this  research  is  in  should-cost 
estimates.  The  should-cost  initiative  falls  within  the  cost  control  focus  of  BBP  and  is 
focused  around  setting  cost  savings  goals.  Should-cost  is  the  concept  of  setting  cost 
targets  that  are  below  those  figured  from  independent  and  internal  program  cost  estimates 
(Better  Buying  Power  3.0,  2013).  These  targets  are  achieved  through  efficiencies  and 
changes  in  DoD  practices  and  culture  that  center  around  driving  down  program  costs. 
Finding  a  more  accurate  tool  for  predicting  the  effects  of  learning  may  be  a  way  of  setting 
and achieving these targets.

Towill and Cherrington (1994) identify three primary sources of estimating error.
The first is error due to inevitable fluctuations in performance that occur
naturally. Estimators have little if any control over this source. The second is
deterministic error with psychological, physiological, or environmental causes.
These errors can be accounted for by estimators, but again they lie largely outside of their
control. The final source of prediction error is modelling error, meaning that the form of
the model used may be inappropriate and therefore not fit the trend line of the data. This
thesis will address the third issue and determine the model form which is most
appropriate to fit Defense aircraft over a production life.

Addressing the issue identified by Towill and Cherrington led to the necessity for
this research. This thesis will focus on a comparison of three modern learning curve
models (Stanford-B, DeJong, and S-Curve) to Wright’s learning curve model, which is
still used in DoD cost estimating today. This comparison has led to the research questions
mentioned in Chapter I and the following hypotheses:

H1: One or more of the four models compared will have a Mean Absolute Percent
Error (MAPE) significantly different from the others.

H2: One or more of the modern learning curve models will be significantly more
accurate than Wright’s learning model in predicting aircraft costs.

H3: The S-Curve model will have the lowest MAPE and prove to be the most
accurate predictor of aircraft costs over time.
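Because all three hypotheses rest on MAPE, a short Python sketch of the measure may be useful. It takes MAPE in its usual sense of mean absolute percent error; the lot costs and model predictions below are hypothetical numbers in arbitrary units, not F-15 data.

```python
def mape(actuals, predictions):
    """Mean absolute percent error between actuals and predictions, in percent."""
    if len(actuals) != len(predictions) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs((a - p) / a) for a, p in zip(actuals, predictions)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical lot costs and the predictions of two hypothetical models.
actual_costs = [100.0, 80.0, 50.0]
model_a = [90.0, 84.0, 55.0]
model_b = [100.0, 72.0, 40.0]

print(f"Model A MAPE: {mape(actual_costs, model_a):.2f}%")
print(f"Model B MAPE: {mape(actual_costs, model_b):.2f}%")
```

In this made-up case Model A has the lower MAPE even though Model B matches the first lot exactly, which shows why the measure is averaged over all lots rather than judged on any single one.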




Conclusion 


This chapter serves as the foundation for the rest of this paper by providing
readers with a basic understanding of some of the primary concepts that led to the
research. Learning and forgetting are both evident in aircraft manufacturing, and failing
to incorporate both into cost estimating can be detrimental to the accuracy of future cost
estimates. The following chapter will give a detailed description of the dataset used, the
methods applied to compare the four models, and any assumptions or ranges of values that
were used in each of the models.

III. Methods


Introduction 

The primary theory behind this research is that modern learning curve models,
which do not assume a constant learning rate, provide a more accurate estimate of annual
aircraft production costs than the conventional learning curve models used by estimators
today. There is a growing interest in finding ways to improve the accuracy of cost
estimates within the DoD; one way of doing so may be improving the accuracy of
learning curves, which are used in a large majority of estimates, especially those
extending over long life-cycles (sometimes over 30 years). If finding a more accurate
forecasting model is possible, then finding which model is best will be of great value.
Part of that theory is to test whether the results of these models are significantly different,
and if so, which one is the best predictor. Current Department of Defense (DoD)
methodology uses Wright’s basic learning curve equation of $T_x = T_1 x^{-b}$, which is
described in detail in Chapter II. While Wright’s model has long been used successfully,
it neglects to include the effects of forgetting, or a decline in performance over time.
Forgetting theory has several applications that can be applied in multiple learning curve
models that do not assume a constant rate of learning.

The initial task is to determine which of the models should be used in comparison
to conventional learning curves, and how to improve upon conventional learning curve
application. Several learning and forgetting curve models were identified for application
in this study, but three were selected for analysis: the Stanford-B model, DeJong’s
Learning Formula, and the S-Curve model. The selection was based on expert opinion
from cost analysts, who confirmed that the three models were applicable to cost
estimators, and on the models’ relevance to the available data from the Air Force Life
Cycle Management Center Cost Staff (AFLCMC/FCZ) at Wright-Patterson AFB, OH
(WPAFB) and other on-line repositories. The
conventional model lacks the application of key factors that affect learning: prior
experience and incompressibility. Accounting for these factors can reduce the amount of
estimating error for airframe costs, and even a modest error reduction of up to 5% could save
millions of dollars in cost overruns over the life of a program. The three models above
account for one or more of these un-learning factors, which can be easily determined by
cost estimators and quickly applied to their models. That applicability and ease of use is
another driver behind using the three aforementioned models in this study. Providing
a model that takes hours or days of secondary analysis and data collection is of little
practical value to estimators, even if it is more accurate. This chapter explains how those
models will be applied to the data in this study, which methods will be used to compare
them, the data analyzed in this research, and limitations in the data that will need to be
addressed.

Data  Collection 

Having identified the three models for analysis, a key step in the process is
collecting the data needed to complete a meaningful and useful comparison. When
members of AFLCMC/FCZ initially approached the researcher to find a more accurate way of
predicting the effects of learning, they were confident that they had a great deal of
relevant data to assist with the task. AFLCMC/FCZ provided learning curve data for 17
Major Defense Acquisition Programs (MDAPs). These data files consisted of Learning Curve
Reports of Annual Unit Cost (AUC) averages as well as the System Program Office’s
(SPO) estimate methods using the conventional learning curve model. Many of the
programs were already completed, and only those with ten or more years of data had
enough information to be useful. However, those costs were unit flyaway costs, for
which learning curves have very little practical use. A flyaway cost for aircraft consists
of prime mission equipment such as basic structure, propulsion and electronic systems,
systems engineering and program management (SE/PM), allowances for engineering
changes (ECO), and warranties (AFSC Cost Estimating Handbook Series, 1986). Areas
such as SE/PM, ECO, and warranties do not experience learning in the way the learning
models depict and therefore make flyaway costs unsuitable for this analysis.

Airframe costs were chosen for this analysis for a number of reasons. First, using
airframe costs allows for the assumption of homogeneity over multiple model types. It is
safe to assume that the F-15 A/B, C/D, and E all have similar if not identical airframes,
making it possible to compare the costs and continue the assumption of learning.
Second, in foreign military sales (FMS) to the allies of the U.S., the airframe of the aircraft
will likely not change despite changes to avionics or electronics systems. Finally, Badiru et
al. (2013) state, “as rapid emergence of new technology necessitates that airframe designs
and manufacturing processes be upgraded frequently ... the opportunity for forgetting
clearly increases.” Therefore, the application of airframe costs to this study will provide
results consistent with that theory.

After some initial research, fighter aircraft became the primary platform type for
this analysis for several reasons. The first reason is that several years of
production data exist and hundreds of units were produced for these aircraft; over 1,150
aircraft were produced in a twenty-year span for the F-15 alone. Bailey (1989) stated that
forgetting is a function of both the amount of learning and the passage of time. This
makes aircraft production cycles spanning several years a prime
candidate to exhibit the declining performance rate attributed to forgetting. The second
reason is that there are several models of fighters (F-15 A-E and F-18 A-F, to name a few),
all of which are variants of the same basic airframe, making it possible to
compare airframe costs from model to model. The final reason for
choosing fighters was the ability to work face to face with cost estimators from the
program offices, who are at Wright-Patterson AFB, OH. This makes collection and
interpretation of data much easier than a long-distance dialogue.

The initial pool of aircraft considered for analysis consisted of five fighters: the
Air Force F-15, F-16, and F-22; the Navy F/A-18; and the joint (Air Force, Navy and
Marines) F-35. The F-35 was eliminated from analysis due to having too few data. The
F-22 was eliminated because the program had two primary
contractors, Lockheed Martin Aeronautics and Boeing Defense, Space & Security, both
of whom contributed components to the airframe production, making it difficult to
measure the effects of learning by one against the other. For this reason, it would not
provide a suitable comparison to other aircraft being tested. The F-16 was a prime
candidate for analysis given the long production life and model upgrade, but relevant
airframe data were incomplete or missing completely in some cases. The F/A-18 had
sufficient available data, but the program switched primary contractors, making it difficult
to compare the costs homogeneously over that transition. This left the F-15 as the primary
platform for analysis based on production history and availability of relevant airframe
costs.

F-15 airframe costs were discovered in two databases. The F-15 A-D airframe
lot averages were acquired from the Cost Estimating System Volume 2 Aircraft Cost
Handbook published in 1987 by the Delta Research Corporation. This handbook
includes all 19 lot purchases from 1970-1985 and details the quantity produced as well as
the total airframe costs (minus administrative costs). This data was presented in Base
Year 1987 dollars (BY$87), meaning that the values for each year are set at a fixed price
as if all of the funds were expended in 1987 (AFCAH, 2007). In other words, each
value was initially expressed in terms of its equivalent purchasing
power in the year 1987.

The F-15E data was taken directly from the Joint Cost Analysis Research
Database (JCARD) system. This data was much more detailed and included five of the
six lot purchases, with Lot 1 data missing. The system had data broken out into each cost
element (including airframe) and the total quantity produced. The JCARD data was in
Then Year dollars (TY$), which are BY$ inflated or deflated to represent the purchasing
power of the funds if they were expended in that given year (AFCAH, 2007). Both the
F-15 A-D BY$87 values and the F-15E TY$ values are standardized in this research to a
Base Year 2014 (BY$14) value using the 2014 OSD Inflation Tables. The OSD inflation
tables are published every year, and this research began in 2014, so those tables have
been used throughout to avoid switching between inflation tables. This step ensures that all
dollar amounts are compared on a level plane and also represent a dollar value that is
relevant to today’s economy.
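The standardization step can be sketched in Python as a simple index ratio. The index values below are placeholders invented for illustration and are not taken from the OSD inflation tables; the sketch also assumes a single raw index per year, which is a simplification, since an actual then-year conversion may rely on weighted indices.

```python
# Hypothetical raw inflation indices normalized so that 2014 = 1.00.
# These are placeholders, not values from the OSD inflation tables.
RAW_INDEX = {1987: 0.55, 1990: 0.62, 2014: 1.00}


def to_base_year(amount: float, from_year: int, to_year: int = 2014) -> float:
    """Restate dollars valued in from_year as dollars valued in to_year."""
    return amount * RAW_INDEX[to_year] / RAW_INDEX[from_year]


by87_cost = 11.0  # hypothetical lot cost in BY$87 (millions)
ty90_cost = 12.4  # hypothetical lot cost in TY$ expended in 1990 (millions)

print(f"BY$87 -> BY$14: {to_base_year(by87_cost, 1987):.1f}")
print(f"TY$90 -> BY$14: {to_base_year(ty90_cost, 1990):.1f}")
```

Once both data sets are restated in the same base year, lot costs from the two sources can be placed on one curve.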
The unit theory data of the entire F-15 A-E data set is shown below in Figure 11.
The data indicate that there are clear signs of forgetting in the later stages of the
production cycle. The average unit cost is actually increasing towards the end of
production rather than decreasing as would be the case with learning theory.

The F-15 data appears to show significant signs of declining performance over the
program’s life cycle. Figure 12 below shows the cumulative actual average flyaway cost
plotted against the cumulative unit number. Clear signs of forgetting over time and a
decline in performance can be seen from the sharp flattening trend in the data. After the
production of around 600 units, the effects of learning nearly come to a complete stop,
and in some cases, the costs actually increase over time.

Figure 12: F-15 A-E Actual Costs


When  the  F-15  cumulative  average  unit  costs  are  plotted  on  a  log-log  graph 
another  significant  trend  becomes  evident.  Figure  13  below  shows  the  log-log  graph  with 
a  linear  regression  line  to  provide  a  frame  of  reference.  A  clear  S-shaped  curve  can  be 
seen from the data with a flattening tail towards the bottom of the curve. This indicates
that there are diminishing returns at the end of the production cycle and that the rate of
improvement is not constant over the life of the program.

The goal of this study is to identify a model, or models, that more accurately
predicts the decline in performance over time and provides more accurate estimates for
airframe costs than Wright’s conventional model. For this research, the F-15 A/B lots
will be treated as historical data and each of the models will be used to estimate the costs
for the C/D and E lots based on that data. This setup simulates a real-world cost
estimating scenario rather than a controlled study where the data are
treated in a way that is beneficial to the researcher.

Figure 13: F-15 Actuals Log-Log Plot


Learning  Curve  Models 

Wright’s  Learning  Curve 

The status quo among learning curve models is Wright’s model, which takes the
form $T_x = T_1 x^{b}$. The parameters of the model are detailed in Chapter II. The two
parameters that must be determined to perform an estimate are $T_1$ and $b$. In common cost
estimating practice, $b$ and $T_1$ are determined through a linear regression of the
natural log of the actual reported costs [ln(y)] on the natural log of cumulative unit
number [ln(x)]. This regression is run under both cumulative average and unit
learning curve theory. The regression providing the better fit according to the $R^2$
value will determine whether unit theory or cumulative average theory is used for the
duration of the study, and the regression equation from that method will determine the
parameters for the model. $R^2$ is a simple goodness-of-fit measure that represents the
proportion of the variance in the dependent variable that is explained by the independent
variable. In other words, it represents the amount of
variability that can be explained by the model (McClave, Benson, and Sincich 2011).
From the linear regression, $b$ is simply the slope of the line and $T_1$ is derived by
raising $e$ to the power of the y-intercept, because the intercept is the natural log of
$T_1$. Once these two parameters are determined for the Wright
model, they remain constant for the other three models used in this analysis.
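
The regression step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the spreadsheet procedure used in this research: the function name `fit_wright` and the sample data are hypothetical, and the data are generated from an assumed 80% curve with a first-unit cost of 100 so that the recovered parameters can be checked.

```python
import math

def fit_wright(units, costs):
    """Least-squares fit of ln(cost) on ln(unit); returns (T1, b, R^2)."""
    xs = [math.log(u) for u in units]
    ys = [math.log(c) for c in costs]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope of the log-log line
    intercept = y_bar - b * x_bar      # equals ln(T1)
    r_squared = sxy ** 2 / (sxx * syy)
    return math.exp(intercept), b, r_squared

# Hypothetical data: T1 = 100 on an 80% learning curve (b = log2(0.8)).
true_b = math.log(0.8) / math.log(2)
units = [1, 2, 4, 8, 16]
costs = [100 * u ** true_b for u in units]

t1, b, r2 = fit_wright(units, costs)
print(f"T1 = {t1:.2f}, b = {b:.4f}, R^2 = {r2:.4f}")
```

Running the same fit on unit costs and on cumulative average costs, then keeping the version with the higher $R^2$, mirrors the selection rule described in the text.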

Stanford-B  Model 

The first model selected for comparison was the Stanford-B model. The Stanford-B
model is a relatively older application of the learning curve using the equation
$T_x = T_1 (x + B)^{b}$. The parameters of the model are described in Chapter II, but the
point of interest in the equation is the equivalent experience unit constant, B.
The B constant falls between 0 and 10 and represents the equivalent units of
previous experience at the start of the production process. If more than 10 units have
been produced, then the constant remains at 10. This parameter accounts for how many
times the process has already been completed and adjusts the learning curve based on that
number. The Stanford-B model is only a slight variation on Wright’s traditional
learning curve model, and when there is no prior experience (B = 0) the models are
identical (Badiru et al., 2013). Properly applying previous experience is
the key to using this equation, and for this study B is represented by the number of
previous units produced. This can be in the form of prototypes, test aircraft, or any other
relevant production unit that was not part of the F-15 A/B production lines. There were
20 test units produced beginning in 1970 which will be counted as prior experience;
because B is capped at 10, the factor B will be ten. This prior experience unit constant
of ten will remain consistent when used in the S-Curve model described below. With B
determined, the data are incorporated into the model to estimate the total lot costs for
the 15 remaining F-15 C/D and E lots. The residuals from these estimates, when compared
to the actual lot costs, are then compared with those of the other three models. Methods
for the comparisons will be covered later in this chapter.

DeJong’s Model

The second model considered for comparison was the DeJong Learning Formula.
DeJong’s model is essentially a simple power function, similar to Wright’s model, which
accounts for the proportion of the task that is mechanical activity relative to the
proportion that is touch labor. The effects of learning are typically only seen in touch,
or human, labor because there is often very little improvement in machine efficiency over
time. The basic form of this learning curve is $T_x = T_1 + M x^{b}$. Unlike the previous
models, DeJong’s model incorporates the incompressibility factor (M); however, there is no
equivalent experience constant. The incompressibility factor, M, is a constant between 0
and 1 where 0 represents a fully manual process and 1 represents a machine-dominated
process (Badiru et al., 2013). Aircraft production falls somewhere in between the two,
but there is no precedent set for application to aircraft production. A U.S. Bureau of
Labor Statistics report from June 1993 gives the following description of the industry:
“[A]lthough the industry assembles a high-tech product, its assembly process is fairly
labor intensive, with relatively little reliance on high-tech production techniques”
(Kronemer and Henneberger, 1993). This report indicates that the highly specialized
process of aircraft production, similar to that of high-end performance automobiles,
supports a proper application of M closer to 0 than 1. Where exactly that number falls is
undefined and leads to some subjectivity. In order to avoid any biases that may skew the
results and to add robustness to the analysis, the application of the constant will start
at 0.0 and move to 0.2 in increments of 0.05, resulting in 5 sets of analysis. This range
of incompressibility factors will remain consistent in the application of the S-Curve model
as well.

S-Curve  Model 

The third and final model that will be used for comparison in this study is the
S-Curve Model, which was developed by Towill and Cherrington in 1994. The S-Curve
model is a combination of the Stanford-B model and DeJong’s model. As mentioned in
Chapter II, this model is based on the assumption of a gradual build-up early on in
production, a period of steady learning, and a flattened portion at the top of the S-curve
called the slope of diminishing returns, often attributed to forgetting. The basic S-Curve
model, $T_x = T_1 + M (x + B)^{b}$, uses the same previous experience unit constant, B, and
incompressibility factor, M, as the Stanford-B and DeJong models respectively. Three of
the four variables on the right side of the equation ($T_1$, $b$, $M$, and $B$) must be
known to make an assumption about the fourth (Badiru et al., 2013). In this study, we will
use the same known $T_1$, $b$, and $B$ used in the prior equations to make an educated
assumption about M as described in the DeJong model above. The S-Curve model is a very
strong representation of how forgetting will affect the rate of learning and is a sound
model to use in testing the theory.
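
To make the relationship among the four models concrete, the sketch below codes each one as a small Python function. The function names are hypothetical. One assumption should be stated loudly: for the DeJong and S-Curve models the sketch uses the weighted form $T_1[M + (1 - M)x^{b}]$ that is commonly given in the learning curve literature, rather than the compact form printed in this chapter, because only the weighted form collapses to Wright’s model (or to the Stanford-B model) when M = 0, as Chapter IV reports. The parameter values are the $T_1$ and $b$ reported in Chapter IV and are used here only for illustration.

```python
def wright(x, t1, b):
    """Wright: cost at cumulative unit x."""
    return t1 * x ** b

def stanford_b(x, t1, b, B):
    """Stanford-B: shifts the curve by B equivalent units of prior experience."""
    return t1 * (x + B) ** b

def dejong(x, t1, b, M):
    """DeJong (assumed weighted form): only the share (1 - M) is subject to learning."""
    return t1 * (M + (1 - M) * x ** b)

def s_curve(x, t1, b, M, B):
    """S-Curve (assumed weighted form): DeJong's M combined with Stanford-B's B."""
    return t1 * (M + (1 - M) * (x + B) ** b)

# Illustrative parameters: T1 in $K and b from the Chapter IV regression, B = 10.
T1, b, B = 53263.0, -0.1813, 10

# Sweep the incompressibility factor over the range used in the study.
for M in (0.0, 0.05, 0.10, 0.15, 0.20):
    row = (wright(500, T1, b), stanford_b(500, T1, b, B),
           dejong(500, T1, b, M), s_curve(500, T1, b, M, B))
    print(M, [round(v, 1) for v in row])
```

Because the incompressible share M does not benefit from learning, any M above zero raises the DeJong and S-Curve estimates relative to the Wright and Stanford-B estimates at the same unit number.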
Research Hypotheses

As previously mentioned, the primary theory for this study is that at least one of
these alternative learning curve models is a more accurate predictor of actual production
costs than traditional learning models. This theory is founded on the belief that forgetting
occurs in airframe production and that models that do not assume a constant rate of learning
will provide a more accurate estimate. The research hypothesis for this theory is that
there is a significant difference in the mean absolute percent error (MAPE) of the
predicted lot costs among the four models. MAPE is a measure of accuracy that takes the
average of the absolute values of the percent error of each prediction. The absolute value
is taken to avoid any cancelling out of positive and negative error values. The smaller the
MAPE, the more accurate and reliable the estimates. This theory led to the following
research hypotheses:

H1: One or more of the alternative learning curve models has a MAPE
statistically different from the conventional DoD model.

H2: One or more of the alternative learning curve models is more accurate than
the conventional DoD model.

H3: The S-Curve model, accounting for both prior experience and
incompressibility, will be the most accurate predictor of airframe costs.

The null hypothesis (H0) for the first hypothesis in this study is that
$\mu_1 = \mu_2 = \mu_3 = \mu_4$, meaning all of the MAPEs are the same, against the
alternative hypothesis (Ha) that at least one of the models has a mean that is different.
If the null hypothesis can be rejected and there is evidence to support a significant
difference, then it will be necessary to test each of the new learning models against the
conventional model. The second null hypothesis mathematically states that
$\mu_1 = \mu_i$, where $i = 2, 3, 4$, to be tested against the Ha: $\mu_1 > \mu_i$.
These individual hypotheses test whether each of the modern learning curve
models has a MAPE significantly lower than the conventional model. One final test
will investigate the third hypothesis and determine which of the models that have
displayed significantly smaller mean errors than the conventional model is the best
predictor. The third null hypothesis states that $\mu_i = \mu_j$, where $\mu_i$ and
$\mu_j$ are both significantly lower than $\mu_1$, to be tested against the Ha:
$\mu_i < \mu_j$. That analysis will provide an answer to the initial inquiry of this
thesis: whether there is an alternative best-fit model that is more accurate than
Wright’s model.

Analysis  Methods 

Once  the  data  is  standardized  to  BY$14  averages,  the  estimates  from  each  of  the 
models  will  be  placed  in  a  spreadsheet  seen  below  in  Table  2,  with  a  column  for  the 
actual  lot  costs,  as  well  as  a  column  for  each  of  the  predicted  lot  costs  using  one  of  the 
four  models  described  above.  There  will  also  be  a  column  for  cumulative  units  and  lot 
number.  The  error  column  is  the  difference  between  the  actual  and  predicted  (Unit  or 
Cumulative  Average  Theory)  values.  Absolute  error  (Abs  Error)  is  simply  the  absolute 
value  of  the  error,  and  absolute  percent  error  (Abs  PE)  is  the  absolute  error  divided  by  the 
actual  cost. 
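
The error columns described above amount to a short calculation, sketched here in Python. The function names are hypothetical; the two sample rows are the actual and predicted lot costs for lots 7 and 8 as they appear in Table 2.

```python
def abs_percent_errors(actual, predicted):
    """Absolute percent error (Abs PE) for each lot: |actual - predicted| / actual."""
    return [abs(a - p) / a for a, p in zip(actual, predicted)]

def mape(actual, predicted):
    """Mean absolute percent error across all lots."""
    apes = abs_percent_errors(actual, predicted)
    return sum(apes) / len(apes)

# Lots 7 and 8 from Table 2 (Wright learning curve).
actual = [1603356.89, 1450706.71]
predicted = [1691386.31, 1585219.83]

errors = [a - p for a, p in zip(actual, predicted)]   # negative = overestimate
apes = abs_percent_errors(actual, predicted)
print(errors, apes, mape(actual, predicted))
```

Taking the absolute value before averaging is what prevents an overestimate on one lot from offsetting an underestimate on another.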

Once the tables have been populated, the next step is to perform the analysis of
data and test the hypotheses. For the overall null hypothesis
$\mu_1 = \mu_2 = \mu_3 = \mu_4$, the sets of percent errors will be compared using either
an ANOVA or a Kruskal-Wallis test with IBM® SPSS statistics software. ANOVA produces an
F-statistic, while the Kruskal-Wallis test produces an H-statistic that approximately
follows a chi-square distribution; either yields a p-value that can reject or fail to
reject the null hypothesis based on the given confidence level that will be addressed
later in this section. The null hypothesis in this case is that all of the sample means
are the same, being tested against the alternative hypothesis that at least one of the
sample means is different.


Table 2: Example of Data Table (Predicted vs. Actual)

Wright Learning Curve

Lot  Units  Cumm Units  Actual Lot Cost  Predicted Lot Cost           Error    Abs Error       Abs PE
  1     30          30      $852,826.86
  2     62          92    $1,350,530.04
  3     72         164    $1,282,332.16
  4    132         296    $2,067,667.84
  5     21         317      $346,113.07
  6    108         425    $1,691,696.11
  7     97         522    $1,603,356.89       $1,691,386.31    $(88,029.42)  88029.42303  0.054903199
  8     94         616    $1,450,706.71       $1,585,219.83   $(134,513.12)  134513.1182  0.092722476
  9     62         678    $1,145,759.72       $1,021,354.05     $124,405.67  124405.6656  0.108579193
 10     60         738    $1,026,855.12         $972,387.31      $54,467.81  54467.80998  0.053043325
 11     15         753      $272,791.52         $240,819.76      $31,971.76  31971.76115  0.117202181
 12     46         799      $840,106.01         $733,188.48     $106,917.53  106917.5318  0.127266715
 13     36         835      $706,890.46         $568,463.82     $138,426.64   138426.642  0.195824742
 14     39         874      $665,194.35         $610,849.27      $54,345.08   54345.0799  0.081698048
 15     36         910      $605,830.39         $559,487.49      $46,342.90  46342.90206  0.076494846
 16     42         952      $729,328.62         $647,695.92      $81,632.70  81632.70346  0.111928561
 17     48        1000      $798,870.89         $733,921.93      $64,948.96  64948.95581  0.081300942
 18     42        1042      $694,080.06         $636,953.56      $57,126.50  57126.49757  0.082305344
 19     42        1084      $693,381.43         $632,316.73      $61,064.70  61064.70419  0.088067983
 20     36        1120      $586,856.87         $538,456.02      $48,400.85  48400.84906  0.082474708
 21     36        1156      $613,192.10         $535,328.10      $77,864.00  77863.99559  0.126981408

MAPE = 9.87%

ANOVA requires three conditions for valid results: the samples must be randomly
selected from the population, the samples must have distributions that are approximately
normal, and the population variances must be equal (McClave, Benson, and Sincich, 2011).
The samples are random in the sense that no selection process was applied to the data
collected. The normality of the data will be addressed in Chapter IV through a
group of histograms using Microsoft Excel. A histogram can be used to display the
frequency of measurements and will thus provide insight into the shape of the distribution
(McClave et al., 2011). The equality of the variances will be tested by dividing the largest
sample standard deviation by the smallest standard deviation. As a rule of thumb, if that
value is two or less, then the variances can be assumed equal. If these conditions are not
met, the analysis will use a non-parametric test to investigate the first hypothesis;
non-parametric tests, unlike ANOVA, do not require an assumption of normal distribution.
The Kruskal-Wallis test can be used to determine if multiple samples arise from the same
distribution and have the same parameters (Kruskal & Wallis, 1952). The test statistic from
the initial ANOVA or Kruskal-Wallis test, both performed in SPSS, will provide insight into
the first hypothesis. If the test statistic is significant, then the null hypothesis is
rejected and at least one of the samples is taken to differ from the others.
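
For readers without access to SPSS, the Kruskal-Wallis statistic can be computed directly from its definition. The sketch below is a plain-Python illustration with a hypothetical function name; it applies the usual average-rank treatment of ties and the standard tie correction, and it is not a reproduction of the SPSS routine. It assumes the pooled data are not all identical, since the tie correction would otherwise divide by zero.

```python
def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    # Pool every observation, remembering which group it came from.
    pooled = sorted((value, g) for g, sample in enumerate(groups) for value in sample)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2          # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i                           # size of this group of tied values
        tie_term += t ** 3 - t
        i = j
    h = 12 / (n_total * (n_total + 1)) * sum(
        r ** 2 / len(sample) for r, sample in zip(rank_sums, groups)
    ) - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    return h / correction

# Hypothetical, fully separated samples.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

With four models, H is compared against a chi-square distribution with three degrees of freedom, whose 5% critical value is approximately 7.815; a larger H rejects the null hypothesis at the 0.05 level.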

To test the second hypothesis, that at least one of the models is more accurate, this
research will use Dunnett’s test performed in SPSS. Dunnett’s test is used to compare
multiple sample means to one value held as the control (Everett & Schrondal, 2010).
Wright’s learning curve model, the status quo, will be used as the control for this study
and the significance will be used to test whether any of the other models’ MAPE values is
less than (<) the control. If the assumption of equal variance is not met, Dunnett’s T3
test will be used for comparing the sample means. The T3 is similar to Dunnett’s test
described above, but it uses each sample as a control individually to compare against the
other values.

The final analysis will be to test which model is most accurate given significant
results for more than one model from the second hypothesis. This analysis will be
conducted through a simple paired difference t-test, again performed in SPSS. A paired
difference experiment compares two sample means and produces a t-statistic that follows a
Student’s t-distribution, which can either reject or fail to reject the null hypothesis
depending on the desired confidence level (McClave et al., 2011). If the assumption of
equal variances is not met and the T3 test is used, information regarding which models are
significantly different will be found in the T3 test and there will be no need for paired
t-tests.
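
The paired difference statistic itself is simple to compute. The following sketch uses only the Python standard library; the function name and the sample values are hypothetical, and the p-value lookup performed by SPSS is left out.

```python
import math
import statistics

def paired_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for a paired difference test."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)          # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Hypothetical APE values for two models on the same three lots.
model_i = [0.02, 0.04, 0.06]
model_j = [0.01, 0.02, 0.03]
t_stat, dof = paired_t_statistic(model_i, model_j)
print(t_stat, dof)
```

Pairing is appropriate here because each lot is estimated by every model, so the comparison is made lot by lot rather than between two independent samples.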

For this study, an α of 0.05 will be used, meaning that the tests will be
evaluated at the 95% confidence level. For purposes of this analysis, this α value means
that a test statistic (F, H, or t) with a resulting p-value < 0.05 will reject the null
hypothesis and support the alternative hypothesis that the mean values between the models
are different. A p-value, or observed significance level, is defined as “the probability
(assuming H0 is true) of observing a value of the test statistic that is at least as
contradictory to the null hypothesis, and supportive of the alternative hypothesis, as the
actual one computed from the sample data” (McClave et al., 2011). In other words, the
p-value is the chance of seeing a result at least this extreme if the null hypothesis were
true. Rejecting the null hypothesis at this level therefore means that, if the means of
the two populations were actually equal, a difference this large would be observed less
than 5% of the time.

Conclusion 

Assuming that all H0 are rejected in favor of the Ha and production rate does not
have a significant effect on the accuracy of the models, the results of this study can
provide a valuable basis for future research and application. If it can be shown that one
of the models is significantly more accurate than the others, then those results can be
presented for further analysis and possibly be enacted into DoD policy. At minimum the
results can provide analysts with a methodology cross-check, which will be explained in
greater detail in Chapter V. The following chapter will show detailed results from the
analysis. Each of the tables and a description of the data, as well as the final results
from each of the tests, will be included. Chapter IV will not include the interpretation
and meaning of the results; that discussion and the potential impacts of the findings will
be included in Chapter V.

IV. Results


Introduction 

The  following  section  contains  the  results  from  the  tests  and  methods  described  in 
Chapter  III.  Chapter  IV  attempts  to  answer  the  three  primary  research  questions 
proposed  earlier  in  this  research:  first,  is  one  or  more  of  the  alternative  learning  curve 
models statistically different from Wright’s conventional model; second, is one or more
of the alternative learning curve models statistically more accurate than Wright’s
conventional model; and third, which model is the most accurate. The following graphs
and  charts  will  attempt  to  answer  these  questions,  and  will  be  accompanied  by  a  brief 
description  of  the  results  shown  within.  This  analysis  will  begin  by  investigating  the  F- 
15  C/D  &  E  models  using  the  A/B  model  as  historical  data.  Discussion  on  the 
implications  of  the  findings,  limitations  of  the  study,  and  possible  areas  for  further 
research  within  the  area  will  be  reserved  for  Chapter  V. 

F-15  C-E  Analysis 

Unit  Theory  &  Cumulative  Average  Theory 

The first step of the analysis was to identify which learning theory was most
appropriate for the given data. For the F-15 data using an M value of 0.20, a log-log
regression was run against the A/B model data using both the unit theory and
cumulative average theory to predict the learning parameters for the C/D and E models
used in the analysis. Figure 14 below shows the regression using the cumulative average
theory, which produced an $R^2$ value of 0.9951. Using the entire data set (shown
previously in Figure 13) produced a much lower $R^2$ value of 0.9167, and the parameters
from the A/B model regression were used because they better explained the learning
taking place. The cumulative average $R^2$ value for the A/B model was slightly higher
than the 0.9735 value produced using the unit theory data (regression graph can be seen
in Appendix A). This indicates that the cumulative average theory should be used for
estimating the C-E model costs and the lot-plot point assumption holds for the data.

These results also provide the basic parameters for all four learning models used
in the study. The learning rate factor, $b$, is the slope of the linear regression line,
which in this case is -0.1813. This value indicates a learning curve slope of 88.19%
($LCS = 2^{b}$). Figure 14 also provides the $T_1$ value that will be used in the analysis.
The intercept of the linear regression equation is the natural log of the theoretical
unit 1 value, $T_1$. By raising the mathematical constant $e$ to the value of the
intercept (10.883), one can determine the average cost of the theoretical first unit; in
this case, that value is $53,263K.
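
The two conversions described above can be checked with a short calculation. The sketch below simply plugs in the slope and intercept reported in this section; the variable names are illustrative.

```python
import math

b = -0.1813          # slope of the log-log regression line
intercept = 10.883   # y-intercept of the regression, equal to ln(T1)

lcs = 2 ** b                 # learning curve slope: cost ratio each time quantity doubles
t1 = math.exp(intercept)     # theoretical first-unit cost, in $K

print(f"LCS = {lcs:.2%}")
print(f"T1  = {t1:,.0f} ($K)")
```

Because the intercept is reported to only three decimal places, the recovered $T_1$ agrees with the reported $53,263K to within rounding.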


Figure 14: F-15 A/B Log-Log Regression

Assumption Parameters

The next step was populating the data tables so that the comparative analysis
could be run. Table 3 below shows the APE values for all 15 lots calculated using each
of the four learning models with an incompressibility factor of 0.1. As the table shows,
Wright’s Curve and the Stanford-B model initially have the lowest MAPEs of the four
models, but analysis must be conducted to determine if there is a significant difference in
the data. That analysis will then be applied to a range of incompressibility factors to
determine how sensitive the results are to a change in that factor.


Table 3: F-15 APE Values for Each Model

M = 0.1

Lot   WLC        Stanford-B  DeJong     S-Curve
  7   0.0549032  0.0509017   0.2716447  0.2680433
  8   0.0927225  0.0892703   0.3285742  0.3254672
  9   0.1085792  0.1085792   0.0904993  0.0882712
 10   0.0530433  0.0554482   0.1634820  0.1613176
 11   0.1172022  0.1193309   0.0873964  0.0854805
 12   0.1272667  0.1292897   0.0771023  0.0752816
 13   0.1958247  0.1975958   0.0049876  0.0065815
 14   0.0816980  0.0836323   0.1387508  0.1370100
 15   0.0764948  0.0783588   0.1476580  0.1459804
 16   0.1119286  0.1136465   0.1059919  0.1044458
 17   0.0813009  0.0829968   0.1468597  0.1453335
 18   0.0823053  0.0839250   0.1482298  0.1467721
 19   0.0880680  0.0896143   0.1433682  0.1419766
 20   0.0824747  0.0839757   0.1525089  0.1511580
 21   0.1269814  0.1283646   0.0984203  0.0971754
AVG   0.0987196  0.0996620   0.1403649  0.1386863

Before the samples can be compared, certain assumptions must be tested. The assumption
of normality was not met, meaning that non-parametric tests must be used for comparing
the means. Table 4 below shows the skewness and kurtosis values for each of the
samples with an M value of 0.1. Kurtosis is a measure of the peakedness of the
distribution.


Table 4: F-15 Descriptive Statistics (M=0.1)

                                                 Skewness                Kurtosis
                     N    Mean    Std. Deviation  Statistic  Std. Error  Statistic  Std. Error
WLC                  15   .0987   .03529          1.426      .580        3.247      1.121
Stan_B               15   .0997   .03584          1.378      .580        3.134      1.121
DeJong               15   .1404   .07749          1.031      .580        2.090      1.121
S_Curve              15   .1387   .07663          1.052      .580        2.086      1.121
Valid N (listwise)   15

High kurtosis values indicate a sharply peaked distribution and are taken as evidence
of non-normality. Histograms for each of the samples are provided in Appendix B, and the
effects of the kurtosis are displayed visually. All of the samples also have a skewness
greater than one, so normality cannot be assumed. The KW test must be used to
determine if the sample distributions are significantly different and if at least one sample
has a median different from the others.
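
The skewness and kurtosis statistics in Table 4 can be approximated outside SPSS with the bias-adjusted sample formulas used by common spreadsheet and statistics packages. The sketch below assumes, without having confirmed it against SPSS documentation, that these are the formulas behind Table 4; the function names are hypothetical and the input is the WLC column of Table 3.

```python
import math
import statistics

def sample_skewness(values):
    """Bias-adjusted sample skewness."""
    n = len(values)
    mean, s = statistics.mean(values), statistics.stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / s) ** 3 for v in values)

def sample_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (zero for a normal distribution)."""
    n = len(values)
    mean, s = statistics.mean(values), statistics.stdev(values)
    fourth = sum(((v - mean) / s) ** 4 for v in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

def skewness_std_error(n):
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def kurtosis_std_error(n):
    return 2 * skewness_std_error(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

# WLC absolute percent errors for lots 7-21 (Table 3).
wlc_ape = [0.0549032, 0.0927225, 0.1085792, 0.0530433, 0.1172022,
           0.1272667, 0.1958247, 0.0816980, 0.0764948, 0.1119286,
           0.0813009, 0.0823053, 0.0880680, 0.0824747, 0.1269814]

print(sample_skewness(wlc_ape), sample_excess_kurtosis(wlc_ape))
print(skewness_std_error(len(wlc_ape)), kurtosis_std_error(len(wlc_ape)))
```

For a sample of 15 the standard-error formulas evaluate to approximately 0.580 and 1.121, the same values shown in the Std. Error columns of Table 4, which depend only on the sample size.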

The assumption of equal variances must also be tested by dividing the largest
sample standard deviation (σ) by the smallest. The DeJong model had
the highest σ with a value of 0.07749 and the Wright (WLC) model had the smallest σ
with a value of 0.03529. Dividing the DeJong σ by the WLC σ equates to a value of
2.19, which is larger than two, meaning that the variances are assumed to be
unequal. This value indicates that the Dunnett T3 test must be used to compare the
means for this analysis.
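
The rule-of-thumb check for equal variances is a one-line ratio, sketched below with the standard deviations from Table 4. The function name is hypothetical.

```python
def std_dev_ratio(std_devs):
    """Largest sample standard deviation divided by the smallest."""
    return max(std_devs.values()) / min(std_devs.values())

# Standard deviations of the APE samples at M = 0.1 (Table 4).
table4_sd = {"WLC": 0.03529, "Stan_B": 0.03584, "DeJong": 0.07749, "S_Curve": 0.07663}

ratio = std_dev_ratio(table4_sd)
equal_variances = ratio <= 2      # rule of thumb from Chapter III
print(round(ratio, 2), equal_variances)
```

Using the rounded values printed in Table 4, the ratio comes out just under 2.20, in line with the 2.19 reported above; either way it exceeds two, so equal variances are not assumed.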
Means Comparison

Since  the  samples  are  not  normally  distributed,  the  KW  test  is  used  to  test  if  the 
samples  are  significantly  different.  The  KW  test  will  analyze  the  null  hypothesis  that  the 
distribution  of  the  APE  value  is  the  same  regardless  of  model  type.  Table  5  below  shows 
the  KW  test  results  for  an  M  value  of  0.1.  As  the  table  shows,  the  p-value  of  0.028  is 
significant  and  therefore  rejects  the  null  hypothesis  indicating  that  at  least  one  of  the 
sample  distributions  is  significantly  different  from  the  others.  This  result,  that  the 
distributions  are  significantly  different,  indicates  that  there  is  a  chance  that  the  means  of 
the samples are different. This process was repeated using the full range of M values
from 0.0 to 0.2. The results were consistent across the range except for 0.0, which had no
statistical difference. The results of these Kruskal-Wallis tests can be seen in Appendix
C.

Table 5: F-15 Kruskal-Wallis Test Results (M = 0.1)

Null Hypothesis                       Test                   Sig.   Decision
The distribution of APE is the same   Independent-Samples    .028   Reject the null
across categories of Model.           Kruskal-Wallis Test           hypothesis.

Asymptotic significances are displayed. The significance level is .05.


The following step was to determine if the means are statistically different and
which models are accounting for that difference. The Dunnett T3 test was used as a
post-hoc ANOVA analysis because the variances are assumed to be unequal. Table 6 below
illustrates the results of the post-hoc analysis. For the purposes of the analysis in SPSS,
the models were each assigned numbers: Wright’s Learning Curve is Model 1, the
Stanford-B is Model 2, the DeJong Formula is Model 3, and the S-Curve is Model 4. For
this test, however, the means are not significantly different. All of the p-values
(represented by the Sig. column) are much greater than 0.05, indicating that although the
distributions are different, the means of those distributions are not.

The final step was to test which model was the most accurate. However, none of
the models are statistically different and therefore the results are inconclusive for which
model is most accurate if the incompressibility factor is assumed to be 0.1. In the
following sections, this means comparison will be repeated for the full range of M values
from 0.0 to 0.2.


Table 6: F-15 Dunnett T3 Test (M=0.1)

                                                                     95% Confidence Interval
(I) Model  (J) Model  Mean Difference (I-J)  Std. Error  Sig.    Lower Bound  Upper Bound
1.00       2.00       -.00094                .01299      1.000   -.0375       .0356
           3.00       -.04165                .02199      .343    -.1055       .0222
           4.00       -.03997                .02178      .376    -.1032       .0232
2.00       1.00        .00094                .01299      1.000   -.0356       .0375
           3.00       -.04070                .02204      .369    -.1047       .0233
           4.00       -.03902                .02184      .404    -.1024       .0243
3.00       1.00        .04165                .02199      .343    -.0222       .1055
           2.00        .04070                .02204      .369    -.0233       .1047
           4.00        .00168                .02814      1.000   -.0776       .0810
4.00       1.00        .03997                .02178      .376    -.0232       .1032
           2.00        .03902                .02184      .404    -.0243       .1024
           3.00       -.00168                .02814      1.000   -.0810       .0776

Sensitivity  Analysis 

As mentioned above, the means comparison process was repeated for the F-15
using M values of 0.0, 0.05, 0.15, and 0.20. When using a value of 0.0, the results
(shown in Table 7 below) did not change. In fact, the models had similar distributions as
well as means. All of the p-values from the Dunnett T3 test were 1.000, indicating that
none of the means are significantly different. This should not be surprising because when
M = 0, the DeJong model essentially turns into Wright’s model and the S-Curve model
turns into the Stanford-B model.
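
The reduction at M = 0 can be checked numerically. The sketch below assumes the commonly cited forms of the four models (cost of unit x, with first-unit cost A, learning exponent b, prior experience units B, and incompressibility factor M). The function names and the sample parameter values are illustrative assumptions and are not the fitted F-15 parameters.

```python
import math

def wright(x, A, b):
    # Wright's learning curve: cost of unit x
    return A * x ** b

def stanford_b(x, A, b, B):
    # Stanford-B: Wright's curve shifted by B prior experience units
    return A * (x + B) ** b

def dejong(x, A, b, M):
    # DeJong: a fraction M of the cost is incompressible (no learning)
    return A * (M + (1.0 - M) * x ** b)

def s_curve(x, A, b, B, M):
    # S-Curve: combines the Stanford-B shift with DeJong's incompressibility
    return A * (M + (1.0 - M) * (x + B) ** b)

# Illustrative (assumed) parameters: 80% learning curve, 10 prior units
A = 100.0
b = math.log(0.8) / math.log(2)
B = 10.0

for x in (1, 10, 100, 1000):
    # With M = 0 the incompressible term vanishes
    assert math.isclose(dejong(x, A, b, 0.0), wright(x, A, b))
    assert math.isclose(s_curve(x, A, b, B, 0.0), stanford_b(x, A, b, B))
```

Under these assumed forms the M = 0 estimates are identical by construction, which is consistent with the p-values of 1.000 for those two pairs.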


Table  7:  F-15  Dunnett  T3  Test  (M=0.0) 


(I) Model  (J) Model  Mean Difference (I-J)  Std. Error  Sig.   95% CI Lower Bound  95% CI Upper Bound
1.00       2.00       -.00094                .01299      1.000  -.0375              .0356
1.00       3.00       .00000                 .01288      1.000  -.0363              .0363
1.00       4.00       -.00111                .01299      1.000  -.0377              .0355
2.00       1.00       .00094                 .01299      1.000  -.0356              .0375
2.00       3.00       .00094                 .01299      1.000  -.0356              .0375
2.00       4.00       -.00017                .01309      1.000  -.0371              .0367
3.00       1.00       .00000                 .01288      1.000  -.0363              .0363
3.00       2.00       -.00094                .01299      1.000  -.0375              .0356
3.00       4.00       -.00111                .01299      1.000  -.0377              .0355
4.00       1.00       .00111                 .01299      1.000  -.0355              .0377
4.00       2.00       .00017                 .01309      1.000  -.0367              .0371
4.00       3.00       .00111                 .01299      1.000  -.0355              .0377


Using an incompressibility factor of 0.05 provided slightly different results. The
Kruskal-Wallis test (shown in Appendix C) yields a p-value of 0.000, indicating that the
distributions of the models are different and presenting the possibility that the means may
be different. The descriptive statistics shown below in Table 8 indicate that the variances
can be assumed equal: the largest σ over the smallest σ yields a value of 1.69, which is
less than two; therefore, the original Dunnett test can be used.

The results of the Dunnett test holding Model 1 (WLC) as the control are shown
below in Table 9. Assuming an incompressibility factor of 0.05, both the DeJong and
S-Curve models are significantly more accurate, with low p-values of 0.033 and 0.030
respectively.


Table  8:  F-15  Descriptive  Statistics  (M=0.05) 


                    N    Mean   Std. Deviation  Skewness               Kurtosis
                                                Statistic  Std. Error  Statistic  Std. Error
WLC                 15   .0987  .03529          1.426      .580        3.247      1.121
Stan_B              15   .0997  .03584          1.378      .580        3.134      1.121
DeJong              15   .0526  .05983          1.936      .580        3.057      1.121
S_Curve             15   .0520  .05862          1.952      .580        3.070      1.121
Valid N (listwise)  15

The DeJong model had a MAPE value of 5.26% and the S-Curve model had a value of
5.20%, the two smallest MAPE values in the entire study.
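
For reference, the MAPE values in the descriptive tables are the means of the per-lot absolute percentage errors (APE). A minimal sketch of that calculation follows; the function names and the three lot costs are hypothetical and serve only to illustrate the arithmetic.

```python
def ape(actual, predicted):
    # Absolute percentage error for a single lot
    return abs(actual - predicted) / abs(actual)

def mape(actuals, predictions):
    # Mean of the per-lot absolute percentage errors
    if not actuals or len(actuals) != len(predictions):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(ape(a, p) for a, p in zip(actuals, predictions)) / len(actuals)

# Hypothetical lot costs (not F-15 data)
actual_lot_costs = [100.0, 90.0, 80.0]
predicted_lot_costs = [95.0, 99.0, 80.0]

example_mape = mape(actual_lot_costs, predicted_lot_costs)
print(f"MAPE = {example_mape:.2%}")
```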


Table  9:  F-15  Dunnett  Test  (M=0.05) 


(I) Model  (J) Model  Mean Difference (I-J)  Std. Error  Sig.   95% CI Lower Bound  95% CI Upper Bound
2.00       1.00       .00094                 .01784      1.000  -.0421              .0440
3.00       1.00       -.04616*               .01784      .033   -.0892              -.0031
4.00       1.00       -.04670*               .01784      .030   -.0898              -.0036

a. Dunnett t-tests treat one group as a control, and compare all other groups against it.
*. The mean difference is significant at the 0.05 level.


The results for an incompressibility factor of 0.05 are shown graphically below in
Figure 15. The graph shows the actual vs. predicted values for the F-15E model, which
accounts for the last 5 lots of the production process. The WLC and Stanford-B values
essentially fell on top of each other, and the same was seen for the DeJong and S-Curve
models; therefore the graph only shows the WLC and S-Curve models to illustrate how
the incompressibility factor changes the estimate. As the graph indicates, the S-Curve
predicted values fall much closer to the actual costs, resulting in a MAPE that is roughly
4.7 percentage points lower than WLC (5.20% versus 9.87%). A similar graph will also
be shown for M = 0.15 to illustrate how larger incompressibility values result in a less
accurate estimate.


Figure 15: F-15E Predicted vs. Actual (M=0.05) [series shown: WLC, S-Curve, Actual]

To test which model is the most accurate, a paired-sample t-test was used to
determine if there was any significant difference between the DeJong model and the
S-Curve model. Table 10 shows the results of the t-test.


Table 10: F-15 t-test DeJong-S-Curve


Paired Differences (Pair 1: DeJong - S_Curve)

Mean    Std. Deviation  Std. Error Mean  95% CI Lower  95% CI Upper  t     df  Sig. (2-tailed)
.00054  .00211          .00054           -.00063       .00171        .991  14  .339

The high p-value of 0.339 indicates that there is no significant difference between the
two models, although both are more accurate than the other two models.
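
The t statistic in Table 10 can be reproduced from the reported summary values alone. The sketch below uses the mean and standard deviation of the paired differences and N = 15 from the table; the critical value 2.145 is the standard two-tailed 5% value for 14 degrees of freedom, and the variable names are illustrative.

```python
import math

# Summary values reported in Table 10 (DeJong minus S-Curve, per lot)
mean_diff = 0.00054
sd_diff = 0.00211
n = 15

std_error = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / std_error       # paired-sample t statistic
df = n - 1

# Two-tailed 5% critical value of Student's t for df = 14
T_CRIT_95_DF14 = 2.145
significant = abs(t_stat) > T_CRIT_95_DF14

print(f"SE = {std_error:.5f}, t = {t_stat:.3f}, df = {df}, significant = {significant}")
```

Because the recomputed statistic falls well below the critical value, the sketch agrees with the reported conclusion that the two models do not differ significantly.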

Repeating the process for an M value of 0.15 again produces a low p-value of 0.000 for
the Kruskal-Wallis test (shown in Appendix C), meaning that the sample distributions are
different. The next step was to determine if any of the means were different and, if
so, which ones. The descriptive statistics shown below in Table 11 indicate that the
variances are unequal, with a value of 2.36 when comparing the largest σ over the smallest σ.
Therefore, the Dunnett T3 test must be used to compare the means.
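
The rule of thumb applied in this chapter (largest standard deviation divided by smallest, compared against two) can be written out as a short check. The sketch below uses the standard deviations reported in Tables 8 and 11; the function names and the string labels are illustrative assumptions.

```python
def sd_ratio(std_devs):
    # Largest standard deviation divided by the smallest
    values = list(std_devs.values())
    return max(values) / min(values)

def choose_test(std_devs, threshold=2.0):
    # Ratio below the threshold: treat variances as equal (Dunnett);
    # otherwise use the unequal-variance Dunnett T3 test
    return "Dunnett" if sd_ratio(std_devs) < threshold else "Dunnett T3"

# Standard deviations of APE reported in Table 8 (M=0.05) and Table 11 (M=0.15)
sd_m005 = {"WLC": 0.03529, "Stan_B": 0.03584, "DeJong": 0.05983, "S_Curve": 0.05862}
sd_m015 = {"WLC": 0.03529, "Stan_B": 0.03584, "DeJong": 0.08336, "S_Curve": 0.08295}

for label, sds in (("M=0.05", sd_m005), ("M=0.15", sd_m015)):
    print(label, f"{sd_ratio(sds):.3f}", choose_test(sds))
```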


Table  11:  F-15  Descriptive  Statistics  (M=0.15) 


                    N    Mean   Std. Deviation  Skewness               Kurtosis
                                                Statistic  Std. Error  Statistic  Std. Error
WLC                 15   .0987  .03529          1.426      .580        3.247      1.121
Stan_B              15   .0997  .03584          1.378      .580        3.134      1.121
DeJong              15   .2491  .08336          .729       .580        1.917      1.121
S_Curve             15   .2473  .08295          .713       .580        1.906      1.121
Valid N (listwise)  15

The results of the Dunnett T3 test are shown below in Table 12. The results verify that at
least one of the models has a significantly different mean from the others, with two
p-values of 0.000.

In this case, the S-Curve and DeJong models are significantly different from the WLC,
with p-values of 0.000; however, they were less accurate than the WLC, with MAPE values
of 24.7% and 24.9% respectively. The results also indicate that there is no difference
between the Stanford-B and WLC models. Figure 16 below details the actual and
predicted costs. Unlike Figure 15 above, in this case the larger incompressibility factor
cuts out too much learning and the S-Curve estimate rises far above the actual values
while the WLC estimates remain the same.


Table 12: F-15 Dunnett T3 Test (M=0.15)


(I) Model  (J) Model  Mean Difference (I-J)  Std. Error  Sig.   95% CI Lower Bound  95% CI Upper Bound
1.00       2.00       -.00094                .01299      1.000  -.0375              .0356
1.00       3.00       -.15035*               .02337      .000   -.2185              -.0822
1.00       4.00       -.14856*               .02328      .000   -.2164              -.0807
2.00       1.00       .00094                 .01299      1.000  -.0356              .0375
2.00       3.00       -.14941*               .02343      .000   -.2176              -.0812
2.00       4.00       -.14762*               .02333      .000   -.2156              -.0797
3.00       1.00       .15035*                .02337      .000   .0822               .2185
3.00       2.00       .14941*                .02343      .000   .0812               .2176
3.00       4.00       .00179                 .03037      1.000  -.0838              .0874
4.00       1.00       .14856*                .02328      .000   .0807               .2164
4.00       2.00       .14762*                .02333      .000   .0797               .2156
4.00       3.00       -.00179                .03037      1.000  -.0874              .0838

*. The mean difference is significant at the 0.05 level.


The final portion of the sensitivity analysis was to test the means assuming an
incompressibility factor of 0.20. The results for these tests were the same as for an
M value of 0.15, and the MAPE values for the DeJong and S-Curve models were even
higher at 35.8% and 35.7% respectively. These results are shown in Appendix D rather
than in the body of this thesis because they are redundant with the earlier results.

Figure 16: F-15E Predicted vs. Actual (M=0.15)


Conclusion 

The purpose of this chapter was to provide the analytical results from the methods
described in Chapter III. The tables and charts above describe test results for the
F-15 using a range of incompressibility assumptions from 0.0 to 0.20. The results varied as
the value of the assumed incompressibility factor changed. A summary chart is shown
below in Table 13.


Table  13:  F-15  Analysis  Summary 


            M=0.0  M=0.05  M=0.10  M=0.15  M=0.20
WLC         N/A    N/A     N/A     N/A     N/A
Stanford-B  X      X       X       X       X
DeJong      X      -       X       +       +
S-Curve     X      -       X       +       +

X indicates model is not significantly different from WLC
(+) indicates model is statistically less accurate than WLC (Higher MAPE)
(-) indicates model is statistically more accurate than WLC (Lower MAPE)

When the factor was held at 0.0 or 0.1, there was no statistical difference between the
models, and these results do not support any of the hypotheses. In contrast, when the
factor is held at 0.05, the DeJong and S-Curve models are more accurate, and these
findings support all three of the hypotheses. Chapter V will delve into the implications
of the findings above; it will also give a brief description of the assumptions and
limitations of the study and areas for improvement. Chapter V will conclude with the
significance of these results as well as areas of future research and possible follow-on
research topics.
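
The symbols in Table 13 follow mechanically from each model's MAPE difference against WLC and the associated p-value. The sketch below reproduces the DeJong row from the values reported in Tables 6, 7, 9, and 12 and Appendix D; the function name and the dictionary layout are illustrative assumptions.

```python
def summary_symbol(mape_diff_vs_wlc, p_value, alpha=0.05):
    # mape_diff_vs_wlc is the model's MAPE minus the WLC MAPE
    if p_value >= alpha:
        return "X"                                # not significantly different
    return "-" if mape_diff_vs_wlc < 0 else "+"   # lower MAPE is more accurate

# DeJong versus WLC: (MAPE difference, Sig.) as reported for each M
dejong_vs_wlc = {
    0.00: (0.00000, 1.000),
    0.05: (-0.04616, 0.033),
    0.10: (0.04165, 0.343),
    0.15: (0.15035, 0.000),
    0.20: (0.25972, 0.000),
}

dejong_row = [summary_symbol(diff, p) for diff, p in dejong_vs_wlc.values()]
print(dejong_row)
```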

V. Conclusions and Recommendations


Introduction 

The purpose of this thesis was to determine if there are more accurate learning
curve models than the conventional models currently used in Defense cost estimating.
Four models were investigated through a series of comparative tests: Wright’s learning
model (used as the status quo), the Stanford-B model, DeJong’s learning formula, and the
S-Curve model. The raw results from the hypothesis tests are shown in Chapter IV and
the Appendices. Chapter V will address the impacts of the findings and the effects they
have on the research questions. The following section will examine what the test results
indicate about each of the four models and whether any conclusions can be drawn from
the F-15 with regard to the research questions. There is also a section detailing the
possible implications at the Air Force and DoD level and how the results may indicate a
way forward in DoD methodology as a whole. The limitations of the study will also be
addressed in this chapter, and it will conclude with a discussion of possible follow-on
research recommendations.

Conclusions  of  Research 

The results of this research are inconclusive with regard to the overarching
research question of whether there is a more accurate learning curve model
available for DoD use than Wright’s original formulation. However, the results do provide
some insight into the effects of learning and where to go from here. The findings also
emphasize the importance of incompressibility in the learning process. Slight changes in
the assumed incompressibility of the process lead to drastically different results as to
which model is most accurate. This significance will be addressed later in the chapter.

The  first  hypothesis  from  this  thesis  was  that  at  least  one  of  the  models  would 
have  a  MAPE  value  statistically  different  from  the  others.  This  was  not  the  case  when 
the  incompressibility  factor  was  assumed  to  be  0.0  or  0.1,  but  the  hypothesis  holds  for 
values  of  0.05,  0.15  and  0.20.  These  results  indicate  that,  although  not  uniformly,  there 
does  appear  to  be  evidence  that  there  is  a  statistical  difference  between  at  least  two  of  the 
models.  This  result  is  important  because  it  sets  up  the  framework  to  be  able  to  test  the 
other  hypotheses  in  the  study. 

The second hypothesis was that at least one model would have a MAPE value
statistically lower than Wright’s model. This hypothesis held only when the
incompressibility factor was assumed to be 0.05. In the other cases, there was
no statistical difference at 0.0 or 0.1, and the models were actually less accurate than
Wright’s model when M = 0.15 and 0.20. This finding indicates that as the process is
assumed to be more automated, Wright’s curve actually performs best. These results
clearly do not fully support the second hypothesis, but do illustrate potential for learning
curve improvement if an actual, universal incompressibility factor is found to be
somewhere between 0.0 and 0.1. Post hoc analysis found that the S-Curve and DeJong
models switch from being statistically more accurate to having no significant difference
in MAPE value somewhere between 0.05 and 0.06. These results can be seen in
Appendix E. The follow-on research section will provide potential impacts of a
statistically supported incompressibility factor and how that factor could potentially
support the findings from these results.

The final part of this analysis was to test which model was the most accurate
among the four. The third hypothesis from this research was that the S-Curve model
would be the most accurate because it accounts for the slow decline in performance over
time due to forgetting. As with the second hypothesis, this hypothesis is only partially
supported when the incompressibility factor is assumed to be 0.05, and rejected by the
other results. At 0.05 both the DeJong and S-Curve models are more accurate than
Wright’s model, but there is no statistical difference between the two. These results lead
to inconclusive outcomes about which model is best, but again point to a potential area of
improvement in learning curve estimating and the importance of incompressibility.

The findings of this study lead to two additional theoretical questions: why were
the results extremely sensitive to the incompressibility factor, and what conclusions can
be drawn about the application of modern learning models in DoD acquisitions? While the
second question will be addressed at the end of this chapter, the answer to the first
question may lie in the data itself. The incompressibility factor essentially represents the
amount of potential learning that is lost for each unit due to automated production
processes. If the incompressibility factor is 0.3, then only 70% of the potential learning
can be achieved. When compounded over several lots and units (over 1000 units for the
F-15 A-E), a small shift in that percentage can result in a massive change in the cost of
the units at the end of the production process.
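
A short derivation may help show why the late-unit estimates react so strongly. It assumes the commonly cited DeJong form for the cost of unit x, with first-unit cost A, learning exponent b < 0, and incompressibility factor M; these symbols follow that textbook form and are not the fitted F-15 parameters.

```latex
\begin{align*}
Y_x &= A\left[M + (1 - M)\,x^{b}\right], \qquad b < 0,\quad 0 \le M \le 1 \\
\lim_{x \to \infty} Y_x &= A\,M \\
\frac{\partial Y_x}{\partial M} &= A\left(1 - x^{b}\right) \;\longrightarrow\; A
  \quad \text{as } x \to \infty
\end{align*}
```

Under these assumptions the unit cost can never fall below A·M, and for late units a shift of 0.05 in M moves the estimate by nearly 5% of the first-unit cost. That shift is large relative to the Wright estimate A·x^b once x is in the hundreds, which is consistent with the sensitivity described above.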

This sensitivity affirms the need for additional research into incompressibility
factors within the DoD and defense contractors in general. As mentioned earlier, the
production of an aircraft is not unlike the production of a high-end sports car. The
level of precision and craftsmanship required precludes the use of certain automated
processes that may be present in an assembly line at Ford or Toyota. Given this dynamic,
assuming the real incompressibility factor is somewhere between 0.0 and 0.1 is not
farfetched. Follow-up inquiries to top practitioners in the learning curve field, including
Dr. Badiru, support the belief that the percentage of automation is very small.
Additionally, different defense contractors may use different
production processes that result in different incompressibility factors and thus increase
the sensitivity of the costs to those factors. This is yet another reason for future
incompressibility research, which will be described later in the chapter.

These results also indicate that learning is affected much more by
incompressibility than by prior experience units. The prior experience units parameter (B)
was the differentiating parameter between the WLC and Stanford-B model, as well as the
difference between DeJong’s learning formula and the S-Curve model. One explanation
for this result may be the large number of units produced for the F-15. When examining
over 1100 units, a change to a mere ten of the units will have a very limited impact on the
outcome. However, if the same prior experience units factor were applied to a smaller
production line such as the B-2 bomber, the difference may become very significant. In
all five cases, there was no statistical difference between each model and its close
relative, meaning that the maximum change in B of 10 had no impact on the long-term
estimates of the models. Therefore, it is safe to assume that simply adding a prior
experience units factor alone provides no value to the estimate if the production number
is high, but the interaction between prior units and incompressibility could be very
significant.
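
The limited effect of B on a long production run can be illustrated with a small calculation. The sketch assumes the commonly cited Wright and Stanford-B forms with a shared first-unit cost, which cancels in the ratio; the 80% slope, the function name, and the 20-unit comparison point are hypothetical choices made for illustration only.

```python
import math

def stanford_b_vs_wright(x, prior_units, slope=0.8):
    """Relative difference (Stanford-B minus Wright) / Wright at unit x.

    Both models share the first-unit cost A, so it cancels in the ratio.
    slope is an assumed learning-curve slope (0.8 means an 80% curve).
    """
    b = math.log(slope) / math.log(2)
    return ((x + prior_units) / x) ** b - 1.0

# Effect of B = 10 late in a long run versus at the end of a short run
late_unit = stanford_b_vs_wright(1100, prior_units=10)
short_run = stanford_b_vs_wright(20, prior_units=10)

print(f"unit 1100: {late_unit:+.2%}")
print(f"unit 20:   {short_run:+.2%}")
```

Under these assumptions the shift from B is a fraction of a percent at unit 1100 but more than ten percent at unit 20, which matches the reasoning that B matters mainly for small production quantities.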

Significance of Research

The results above indicate that there is potential for a more accurate model in
predicting the effects of learning within DoD acquisitions. This study was unique in two
primary areas. First, it investigated Defense aircraft costs, where past studies had
primarily investigated commercial aircraft or component parts. Second, due to the
nature of DoD cost estimating, it examined costs from an external perspective rather than
an internal one; therefore, the availability and accuracy of data may have led to more
assumptions than in prior studies.

Despite  these  intricacies,  a  few  major  conclusions  can  be  drawn  from  the  results. 
The  first  is  that  there  is  potential  with  two  of  the  alternative  learning  curve  models  to 
increase  estimate  accuracy  using  learning  curves  by  up  to  5%  over  the  entire  production 
cycle  based  upon  the  results  for  an  incompressibility  factor  of  0.05.  Post  hoc  analysis 
indicated  that  the  largest  difference  between  the  Wright  and  S-Curve  models,  just  over 
5.2%,  was  seen  at  0.04  (these  results  can  also  be  seen  in  Appendix  E).  While  this 
percentage  may  seem  small,  for  the  $20B+  production  cycle  of  the  F-15  A-E  airframes, 
this  percentage  could  result  in  a  savings  of  over  $1B  just  by  changing  one  estimating  tool. 
This  thesis  does  not  go  so  far  as  to  say  current  cost  estimating  methodology  is  wrong; 
cost  estimates  are  just  that,  estimates.  This  research  suggests  and  hopes  to  provide  the 
foundation  for  ways  to  improve  current  learning  curve  methodology.  Which  model 
should  be  used  is  an  area  that  requires  more  analysis.  Thus  far,  the  S-Curve  and  DeJong 
models  appear  to  be  worthy  candidates.  Further  analysis  incorporating  incompressibility 
could  reveal  more  information  related  to  the  application  of  the  S-Curve  and  DeJong 
models  and  consequently,  the  theory  of  forgetting  within  DoD  methodology. 
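
Taking the two figures in the paragraph above at face value (a production cycle of roughly $20B and the 5.2% post hoc difference), the order of magnitude of the claim can be reproduced as a rough check. Treating an accuracy difference as a dollar amount is itself an assumption of this sketch.

```latex
\[
0.052 \times \$20\,\text{B} \approx \$1.04\,\text{B}
\]
```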

While the findings of this study do not support all of the hypotheses of this
research or indicate which model is the best predictor of future costs, they do open up a
dialogue for future change in DoD acquisition methodology. These results stress the
importance of incompressibility in learning and the potential for improvement based on
that significance. Future research into incompressibility in aircraft production and
comparative research into additional airframes as well as any of the dozens of other
learning models available may help provide decision makers with additional information
and hopefully increase the accuracy of cost estimates as a whole.

Assumptions  and  Limitations 

As always, there are limitations to this research and the methods used to test the
hypotheses. In addition to the limitations, some threats to external validity were
identified. One of those threats is the type of aircraft used in the analysis. It may prove
that different types of aircraft provide different results and that one model may be more
accurate for fighters but provide results that are non-significant for cargo aircraft. This
research began by applying the methods only to fighter aircraft and opens the door for
other researchers to expand the theory into other platforms and domains. However,
dividing aircraft data into categories may spread an already small sample size too thin.

One major limitation to this study was the amount of data that was available to
analyze. While the results of the analysis prove to be inconclusive, the data presented in
this analysis is only a small fraction of all aircraft programs and an even smaller portion
of DoD programs as a whole. AFLCMC/FCZ only has access to programs under their
control, and only data from those programs which reported on learning curves. These
factors will limit the number of aircraft available for future analysis. A larger data set
would have been preferred, but in this case the sample was limited to the data available,
and adding one or two additional aircraft did not improve the validity of the results given
their inconclusive nature. Follow-on analysis of incompressibility and
additional Air Force and DoD programs is necessary before generalization of the findings
can be made.

Another limitation is the accuracy of the data reported as actual costs. The
accuracy, or lack thereof, in updating actual values for estimates has long been an issue in
DoD and has only recently been brought to light in an effort to clean up data repositories.
However, the fact that many of the programs are under AFLCMC/FCZ local control and
span multiple decades should help to mitigate some of the uncertainty of the results.
An additional assumption was using the lot plot point with the cumulative average theory.
Lot data is often used in DoD cost estimates due to the nature of contractor reports, but
that type of analysis has not been applied to the additional models used in this analysis.
However, the methods used were backed up by the Air Force Cost Analysis Handbook as
well as other studies into learning curves. This methodology, in addition to the fact that
lot data is widely used throughout the DoD, should reduce the effect the lot plot point
assumption has on the results and may at the same time make them more generalizable
to individual unit data.

Recommendations  for  Future  Research 

This research answered several questions about the effects of learning in DoD, but
there are still more questions that need to be addressed. This research sought to
determine if any alternative learning models are more accurate than Wright’s model,
which is commonly used throughout Defense acquisition programs today. This study
took steps toward accomplishing that goal and found that the S-Curve and DeJong
models may be more accurate if the incompressibility factor for aircraft production is
found to be between 0.0 and 0.05. However, the evidence is inconclusive as to which
model is the most accurate and whether or not the incompressibility assumption above is
valid. Future research should look to expand upon these findings to determine which of
these models, or any additional models, is the most accurate.

Additional research into incompressibility factors would prove valuable to this
learning curve analysis and paramount to any additional research using these models. As
mentioned earlier, one of the major assumptions from this study was using an
incompressibility range from 0.0 to 0.2. Future research into what incompressibility
factor should be used for aircraft production would provide insight into which models
may be more appropriate and also provide further insight into the validity of these results.
Also, analysis into how incompressibility factors change with different Defense
contractors or how different platform types affect the production process could provide
even more accuracy in this and future findings. Clarifying these uncertainties will help
produce more accurate and useful cost estimates using the models described above.

Once a defensible and accurate incompressibility factor can be found, future
research should also look to broaden the scope of the programs used in the analysis. This
research focused on fighter aircraft, and the initial pool of six was trimmed down to one
aircraft. Follow-on studies should attempt to extend the findings to additional
platforms such as bombers, cargo/tanker, and unmanned aircraft. Also, the use of
additional models that do not rely on the incompressibility factor would provide more
robust results. Results from the analysis of the F-15 should not necessarily be
generalized to all aircraft as a whole. Further analysis may shed light on which models
perform best on which aircraft or whether there is a single model that can be generalized
to all platforms.

Summary 

When this research began, the goal was to find out if a more accurate learning
curve model than what is currently used in DoD exists. The AFLCMC cost staff
supported the effort to find a way to improve current learning curve methodology in
Defense acquisitions. Through the efforts of this thesis and the findings contained within,
there is evidence to support the hypothesis that at least one of the models may be more
accurate than Wright’s original model. This research found that both the DeJong and
S-Curve models are statistically more accurate than the status quo given the
incompressibility factor is somewhere between 0.0 and 0.05. However, if the factor is
assumed to be 0.1 or higher, then Wright’s model is the most accurate and the additional
models do not improve on the current methodology. The results as to which model is the
most accurate are inconclusive and neither support nor disprove the hypothesis that the
S-Curve model is the most accurate of the four. At a minimum, this thesis provides the
foundation for further research into additional types of aircraft as well as an applicable
incompressibility factor that may indicate which model is the most accurate, after which
the alternative models can be considered for DoD methodology.

The argument behind this thesis is that the current DoD learning curve
methodology using Wright’s 75+ year old model should not be accepted as the status quo
for the sake of simplicity or nostalgia. If a more accurate learning model exists that can
be applied to cost estimating within the Department of Defense, it should be investigated
and analyzed. While the results of this thesis are inconclusive with regard to which model
may be the best, they do illustrate the point that there are additional models available that
are more accurate in certain cases. They also provide the foundation for future research in
Defense acquisitions, which can hopefully increase the accuracy and reliability of cost
estimates and create a more efficient use of government funding.

Appendix A


Figure: F-15 Unit Theory Log-Log Regression


Appendix B


Histogram summary statistics by model (vertical axis: Frequency)

Model    Mean  Std. Dev.  N
WLC      0.10  0.035      15
Stan B   0.10  0.036      15
DeJong   0.14  0.077      15
S Curve  0.14  0.077      15


Appendix  C 


Hypothesis Test Summary (Independent-Samples Kruskal-Wallis Test)

Null hypothesis: The distribution of APE is the same across categories of Model.

M       Sig.    Decision
0.0     .911    Retain the null hypothesis.
0.05    .000    Reject the null hypothesis.
0.15    .000    Reject the null hypothesis.
0.20    .000    Reject the null hypothesis.

Asymptotic significances are displayed. The significance level is .05.
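
The Kruskal-Wallis results above compare the APE distributions of the four models at each assumed value of M. As a rough illustration of how such a statistic is formed, the Python sketch below ranks the pooled observations, computes the tie-corrected H statistic, and converts it to an asymptotic p-value using the closed-form chi-square tail for three degrees of freedom (four groups). The APE values are invented placeholders, not the F-15 data, and the function names are illustrative; this is not the software that produced the table.

```python
import math
from itertools import chain


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    return h / correction if correction > 0 else h


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    x = max(x, 0.0)  # guard against tiny negative values from rounding
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Hypothetical APE values for four models (placeholders, not the F-15 data).
ape_by_model = {
    "WLC": [0.05, 0.08, 0.10, 0.12, 0.19],
    "StanB": [0.06, 0.09, 0.11, 0.13, 0.20],
    "DeJong": [0.25, 0.30, 0.36, 0.41, 0.47],
    "SCurve": [0.24, 0.31, 0.35, 0.40, 0.46],
}
h_stat = kruskal_wallis_h(list(ape_by_model.values()))
p_value = chi2_sf_3df(h_stat)  # valid here because 4 groups give 3 df
print(f"H = {h_stat:.3f}, asymptotic p = {p_value:.4f}")
print("reject" if p_value < 0.05 else "retain", "the null hypothesis at the .05 level")
```

The closed-form tail keeps the sketch free of third-party dependencies; with a different number of groups, a general chi-square routine would be needed instead.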
Appendix  D


F-15 Descriptive Statistics (M = 0.20)

                    N          Mean       Std. Deviation  Skewness               Kurtosis
Model               Statistic  Statistic  Statistic       Statistic  Std. Error  Statistic  Std. Error
WLC                 15         .0987      .03529          1.426      .580        3.247      1.121
Stan_B              15         .0997      .03584          1.378      .580        3.134      1.121
DeJong              15         .3584      .08824           .563      .580        1.765      1.121
S_Curve             15         .3568      .08788           .547      .580        1.754      1.121
Valid N (listwise)  15
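
The skewness and kurtosis standard errors in the descriptive statistics above are identical for every model because they depend only on the sample size. A minimal Python sketch, assuming the usual small-sample formulas for these standard errors (the function names are illustrative), gives the values reported for N = 15:

```python
import math


def skewness_std_error(n):
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def kurtosis_std_error(n):
    """Standard error of sample excess kurtosis for a sample of size n."""
    return 2.0 * skewness_std_error(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


n = 15  # number of observations per model in the tables
print(f"SE skewness = {skewness_std_error(n):.3f}")  # SE skewness = 0.580
print(f"SE kurtosis = {kurtosis_std_error(n):.3f}")  # SE kurtosis = 1.121
```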

F-15 Dunnett T3 Test (M = 0.20)

                      Mean                                 95% Confidence Interval
(I) Model  (J) Model  Difference (I-J)  Std. Error  Sig.   Lower Bound  Upper Bound
1.00       2.00       -.00094           .01299      1.000  -.0375        .0356
           3.00       -.25972*          .02454       .000  -.3314       -.1880
           4.00       -.25804*          .02445       .000  -.3295       -.1866
2.00       1.00        .00094           .01299      1.000  -.0356        .0375
           3.00       -.25877*          .02459       .000  -.3306       -.1870
           4.00       -.25709*          .02451       .000  -.3286       -.1855
3.00       1.00        .25972*          .02454       .000   .1880        .3314
           2.00        .25877*          .02459       .000   .1870        .3306
           4.00        .00168           .03216      1.000  -.0889        .0923
4.00       1.00        .25804*          .02445       .000   .1866        .3295
           2.00        .25709*          .02451       .000   .1855        .3286
           3.00       -.00168           .03216      1.000  -.0923        .0889

*. The mean difference is significant at the 0.05 level.

Model codes are not labeled in the output. The mean differences are consistent with 1 = WLC, 2 = Stan_B, 3 = DeJong, and 4 = S_Curve, the order used in the descriptive statistics.
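
The standard errors in the Dunnett T3 table can be cross-checked against the Appendix D descriptive statistics. The sketch below assumes the unpooled (unequal-variance) standard error for a pairwise difference, sqrt(s_i^2/n_i + s_j^2/n_j), and treats model codes 1 to 4 as WLC, Stan_B, DeJong, and S_Curve, which is inferred from the mean differences and not stated in the output. The standard deviations are the rounded values from the table, so agreement is only to about the fifth decimal place.

```python
import math

n = 15  # observations per model
std_dev = {"WLC": 0.03529, "Stan_B": 0.03584, "DeJong": 0.08824, "S_Curve": 0.08788}

# Standard errors as reported in the Dunnett T3 table (assumed code-to-model mapping).
reported_se = {
    ("WLC", "Stan_B"): 0.01299,
    ("WLC", "DeJong"): 0.02454,
    ("WLC", "S_Curve"): 0.02445,
    ("Stan_B", "DeJong"): 0.02459,
    ("Stan_B", "S_Curve"): 0.02451,
    ("DeJong", "S_Curve"): 0.03216,
}


def unpooled_std_error(sd_i, sd_j, n_i, n_j):
    """Standard error of a mean difference without assuming equal variances."""
    return math.sqrt(sd_i ** 2 / n_i + sd_j ** 2 / n_j)


computed_se = {
    pair: unpooled_std_error(std_dev[pair[0]], std_dev[pair[1]], n, n)
    for pair in reported_se
}
for pair, se in computed_se.items():
    print(f"{pair[0]:8s} vs {pair[1]:8s} computed {se:.5f}  reported {reported_se[pair]:.5f}")
```

Because the DeJong and S_Curve errors are far more variable than those of WLC and Stan_B at M = 0.20, a procedure that does not pool variances is a natural fit for this comparison.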
Appendix  E


Results  for  M  =  0.06 


Descriptive Statistics

                    N          Minimum    Maximum    Mean       Std. Deviation  Skewness               Kurtosis
Model               Statistic  Statistic  Statistic  Statistic  Statistic       Statistic  Std. Error  Statistic  Std. Error
WLC                 15         .0530      .1958      .098720    .0352870        1.426      .580        3.247      1.121
StanB               15         .0509      .1976      .099662    .0358351        1.378      .580        3.134      1.121
DeJong              15         .0046      .2342      .063668    .0652121        1.765      .580        2.962      1.121
SCurve              15         .0036      .2310      .062168    .0645087        1.755      .580        2.933      1.121
Valid N (listwise)  15

Multiple Comparisons

AbsPE
Dunnett t (2-sided)a

                              Mean                                 95% Confidence Interval
(I) ModelType  (J) ModelType  Difference (I-J)  Std. Error  Sig.   Lower Bound  Upper Bound
2              1               .0009424         .0190991    1.000  -.045171     .047056
3              1              -.0350517         .0190991     .174  -.081165     .011061
4              1              -.0365514         .0190991     .149  -.082665     .009562

a. Dunnett t-tests treat one group as a control, and compare all other groups against it.

Model codes are not labeled in the output. The mean differences are consistent with 1 = WLC (the control), 2 = StanB, 3 = DeJong, and 4 = SCurve.
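
Unlike the Dunnett T3 output, the Dunnett t comparison reports a single standard error for every row. A short Python sketch, assuming equal group sizes and the usual pooled within-group variance (mean squared error) for a one-way layout, recovers that value from the M = 0.06 standard deviations. The variable names are illustrative.

```python
import math

n = 15  # observations per model
std_dev = {"WLC": 0.0352870, "StanB": 0.0358351, "DeJong": 0.0652121, "SCurve": 0.0645087}

# With equal group sizes, the pooled variance is the mean of the group variances.
mse = sum(sd ** 2 for sd in std_dev.values()) / len(std_dev)
error_df = len(std_dev) * (n - 1)  # within-group degrees of freedom

# Standard error of the difference between any model and the control.
pooled_se = math.sqrt(mse * (1.0 / n + 1.0 / n))

print(f"MSE = {mse:.7f} on {error_df} df")
print(f"pooled standard error = {pooled_se:.7f}")  # table reports .0190991
```

Pooling assumes the four models have similar error variances, which is worth keeping in mind because the DeJong and SCurve standard deviations are nearly twice those of WLC and StanB.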
Results  for  M  =  0.04


Descriptive Statistics

                    N          Minimum    Maximum    Mean       Std. Deviation  Skewness               Kurtosis
Model               Statistic  Statistic  Statistic  Statistic  Statistic       Statistic  Std. Error  Statistic  Std. Error
WLC                 15         .0530      .1958      .098720    .0352870        1.426      .580        3.247      1.121
StanB               15         .0509      .1976      .099662    .0358351        1.378      .580        3.134      1.121
DeJong              15         .0045      .1871      .047245    .0558451        1.700      .580        1.903      1.121
SCurve              15         .0030      .1837      .046700    .0553423        1.635      .580        1.702      1.121
Valid N (listwise)  15
Multiple Comparisons

AbsPE
Dunnett t (2-sided)a

                              Mean                                 95% Confidence Interval
(I) ModelType  (J) ModelType  Difference (I-J)  Std. Error  Sig.   Lower Bound  Upper Bound
2              1               .0009424         .0170399    1.000  -.040199      .042084
3              1              -.0514745*        .0170399     .011  -.092616     -.010333
4              1              -.0520198*        .0170399     .010  -.093161     -.010879

a. Dunnett t-tests treat one group as a control, and compare all other groups against it.
*. The mean difference is significant at the 0.05 level.

Model codes are not labeled in the output. The mean differences are consistent with 1 = WLC (the control), 2 = StanB, 3 = DeJong, and 4 = SCurve.
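
The M = 0.04 comparison can be read directly from its confidence intervals: a difference is flagged as significant when the 95% interval excludes zero. The sketch below applies that check to the three rows and back-solves the critical multiplier implied by each interval (half-width divided by standard error). The model labels assume codes 1 to 4 map to WLC, StanB, DeJong, and SCurve, and the multiplier is inferred from the reported bounds, not taken from a Dunnett table.

```python
# (model, control, mean difference, std. error, lower bound, upper bound) for M = 0.04
rows = [
    ("StanB", "WLC", 0.0009424, 0.0170399, -0.040199, 0.042084),
    ("DeJong", "WLC", -0.0514745, 0.0170399, -0.092616, -0.010333),
    ("SCurve", "WLC", -0.0520198, 0.0170399, -0.093161, -0.010879),
]

critical_values = []
flags = []
for model, control, diff, se, lower, upper in rows:
    half_width = (upper - lower) / 2
    critical_values.append(half_width / se)  # implied multiplier on the standard error
    flags.append(lower > 0 or upper < 0)  # interval excludes zero
    print(
        f"{model} vs {control}: diff = {diff:+.4f}, "
        f"implied multiplier = {critical_values[-1]:.3f}, "
        f"significant at .05 = {flags[-1]}"
    )
```

A negative, significant difference means the model's mean absolute percent error is lower than that of the control, which is the sense in which the DeJong and S-Curve models appear more accurate at this value of M.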